TRANSPORTERS AND ION CHANNELS
TECHNICAL FIELD 

This invention relates to nucleic acid and amino acid sequences of transporters and ion channels 
and to the use of these sequences in the diagnosis, treatment, and prevention of transport, neurological,
muscle, and immunological disorders, and in the assessment of the effects of exogenous compounds on 
the expression of nucleic acid and amino acid sequences of transporters and ion channels. 

BACKGROUND OF THE INVENTION 

Eukaryotic cells are surrounded and subdivided into functionally distinct organelles by
hydrophobic lipid bilayer membranes which are highly impermeable to most polar molecules. Cells and
organelles require transport proteins to import and export essential nutrients and metal ions including
K+, NH4+, Pi, SO4(2-), sugars, and vitamins, as well as various metabolic waste products. Transport
proteins also play roles in antibiotic resistance, toxin secretion, ion balance, synaptic neurotransmission,
kidney function, intestinal absorption, tumor growth, and other diverse cell functions (Griffith, J. and C.
Sansom (1998) The Transporter Facts Book, Academic Press, San Diego CA, pp. 3-29). Transport
can occur by a passive concentration-dependent mechanism, or can be linked to an energy source such
as ATP hydrolysis or an ion gradient. Proteins that function in transport include carrier proteins, which
bind to a specific solute and undergo a conformational change that translocates the bound solute across
the membrane, and channel proteins, which form hydrophilic pores that allow specific solutes to diffuse
through the membrane down an electrochemical solute gradient.

Carrier proteins which transport a single solute from one side of the membrane to the other 
are called uniporters. In contrast, coupled transporters link the transfer of one solute with 
simultaneous or sequential transfer of a second solute, either in the same direction (symport) or in the
opposite direction (antiport). For example, intestinal and kidney epithelium contains a variety of
symporter systems driven by the sodium gradient that exists across the plasma membrane. Sodium
moves into the cell down its electrochemical gradient and brings the solute into the cell with it. The
sodium gradient that provides the driving force for solute uptake is maintained by the ubiquitous
Na+/K+ ATPase system. Sodium-coupled transporters include the mammalian glucose transporter
(SGLT1), iodide transporter (NIS), and multivitamin transporter (SMVT). All three transporters have
twelve putative transmembrane segments, extracellular glycosylation sites, and cytoplasmically
oriented N- and C-termini. NIS plays a crucial role in the evaluation, diagnosis, and treatment of
various thyroid pathologies because it is the molecular basis for radioiodide thyroid-imaging
techniques and for specific targeting of radioisotopes to the thyroid gland (Levy, O. et al. (1997) Proc.
Natl. Acad. Sci. USA 94:5568-5573). SMVT is expressed in the intestinal mucosa, kidney, and
placenta, and is implicated in the transport of the water-soluble vitamins, e.g., biotin and pantothenate
(Prasad, P.D. et al. (1998) J. Biol. Chem. 273:7501-7506).

One of the largest families of transporters is the major facilitator superfamily (MFS), also 
called the uniporter-symporter-antiporter family. MFS transporters are single polypeptide carriers
that transport small solutes in response to ion gradients. Members of the MFS are found in all classes
of living organisms, and include transporters for sugars, oligosaccharides, phosphates, nitrates,
nucleosides, monocarboxylates, and drugs. MFS transporters found in eukaryotes all have a structure
comprising 12 transmembrane segments (Pao, S.S. et al. (1998) Microbiol. Molec. Biol. Rev. 62:1-34).
The largest family of MFS transporters is the sugar transporter family, which includes the seven
glucose transporters (GLUT1-GLUT7) found in humans that are required for the transport of glucose
and other hexose sugars. These glucose transport proteins have unique tissue distributions and
physiological functions. GLUT1 provides many cell types with their basal glucose requirements and
transports glucose across epithelial and endothelial barrier tissues; GLUT2 facilitates glucose uptake
or efflux from the liver; GLUT3 regulates glucose supply to neurons; GLUT4 is responsible for
insulin-regulated glucose disposal; and GLUT5 regulates fructose uptake into skeletal muscle.
Defects in glucose transporters are involved in a recently identified neurological syndrome causing
infantile seizures and developmental delay, as well as glycogen storage disease, Fanconi-Bickel
syndrome, and non-insulin-dependent diabetes mellitus (Mueckler, M. (1994) Eur. J. Biochem.
219:713-725; Longo, N. and L.J. Elsas (1998) Adv. Pediatr. 45:293-313).

Monocarboxylate anion transporters are proton-coupled symporters with a broad substrate 
specificity that includes L-lactate, pyruvate, and the ketone bodies acetate, acetoacetate, and 
beta-hydroxybutyrate. At least seven isoforms have been identified to date. The isoforms are predicted 
to have twelve transmembrane (TM) helical domains with a large intracellular loop between TM6 and
TM7, and play a critical role in maintaining intracellular pH by removing the protons that are produced
stoichiometrically with lactate during glycolysis. The best characterized H+-monocarboxylate
transporter is that of the erythrocyte membrane, which transports L-lactate and a wide range of other
aliphatic monocarboxylates. Other cells possess H+-linked monocarboxylate transporters with differing
substrate and inhibitor selectivities. In particular, cardiac muscle and tumor cells have transporters that
differ in their Km values for certain substrates, including stereoselectivity for L- over D-lactate, and in
their sensitivity to inhibitors. There are Na+-monocarboxylate cotransporters on the luminal surface of
intestinal and kidney epithelia, which allow the uptake of lactate, pyruvate, and ketone bodies in these
tissues. In addition, there are specific and selective transporters for organic cations and organic anions
in organs including the kidney, intestine and liver. Organic anion transporters are selective for
hydrophobic, charged molecules with electron-attracting side groups. Organic cation transporters, such
as the ammonium transporter, mediate the secretion of a variety of drugs and endogenous metabolites,
and contribute to the maintenance of intercellular pH (Poole, R.C. and A.P. Halestrap (1993) Am. J.
Physiol. 264:C761-C782; Price, N.T. et al. (1998) Biochem. J. 329:321-328; and Martinelle, K. and I.
Haggstrom (1993) J. Biotechnol. 30:339-350).

ATP-binding cassette (ABC) transporters are members of a superfamily of membrane proteins 
that transport substances ranging from small molecules such as ions, sugars, amino acids, peptides, and 
phospholipids, to lipopeptides, large proteins, and complex hydrophobic drugs. ABC transporters 
consist of four modules: two nucleotide-binding domains (NBD), which hydrolyze ATP to supply the
energy required for transport, and two membrane-spanning domains (MSD), each containing six
putative transmembrane segments. These four modules may be encoded by a single gene, as is the case
for the cystic fibrosis transmembrane conductance regulator (CFTR), or by separate genes. When encoded by
separate genes, each gene product contains a single NBD and MSD. These "half-molecules" form
homo- and heterodimers, such as Tap1 and Tap2, the endoplasmic reticulum-based major
histocompatibility (MHC) peptide transport system. Several genetic diseases are attributed to defects in
ABC transporters, such as the following diseases and their corresponding proteins: cystic fibrosis
(CFTR, an ion channel), adrenoleukodystrophy (adrenoleukodystrophy protein, ALDP), Zellweger
syndrome (peroxisomal membrane protein-70, PMP70), and hyperinsulinemic hypoglycemia
(sulfonylurea receptor, SUR). Overexpression of the multidrug resistance (MDR) protein, another
ABC transporter, in human cancer cells makes the cells resistant to a variety of cytotoxic drugs used in
chemotherapy (Taglicht, D. and S. Michaelis (1998) Meth. Enzymol. 292:130-162).

A number of metal ions such as iron, zinc, copper, cobalt, manganese, molybdenum, selenium, 
nickel, and chromium are important as cofactors for a number of enzymes. For example, copper is 
involved in hemoglobin synthesis, connective tissue metabolism, and bone development, by acting as a
cofactor in oxidoreductases such as superoxide dismutase, ferroxidase (ceruloplasmin), and lysyl
oxidase. Copper and other metal ions must be provided in the diet, and are absorbed by transporters in
the gastrointestinal tract. Plasma proteins transport the metal ions to the liver and other target organs,
where specific transporters move the ions into cells and cellular organelles as needed. Imbalances in
metal ion metabolism have been associated with a number of disease states (Danks, D.M. (1986) J.
Med. Genet. 23:99-106).

Transport of fatty acids across the plasma membrane can occur by diffusion, a high capacity, 
low affinity process. However, under normal physiological conditions a significant fraction of fatty 
acid transport appears to occur via a high affinity, low capacity protein-mediated transport process. 
Fatty acid transport protein (FATP), an integral membrane protein with four transmembrane segments,
is expressed in tissues exhibiting high levels of plasma membrane fatty acid flux, such as muscle, heart,
and adipose. Expression of FATP is upregulated in 3T3-L1 cells during adipose conversion, and
expression in COS7 fibroblasts elevates uptake of long-chain fatty acids (Hui, T.Y. et al. (1998) J.
Biol. Chem. 273:27420-27429).

Mitochondrial carrier proteins are transmembrane-spanning proteins which transport ions and
charged metabolites between the cytosol and the mitochondrial matrix. Examples include the ADP,
ATP carrier protein; the 2-oxoglutarate/malate carrier; the phosphate carrier protein; the pyruvate
carrier; the dicarboxylate carrier which transports malate, succinate, fumarate, and phosphate; the
tricarboxylate carrier which transports citrate and malate; and the Graves' disease carrier protein, a
protein recognized by IgG in patients with active Graves' disease, an autoimmune disorder resulting in
hyperthyroidism. Proteins in this family consist of three tandem repeats of an approximately 100 amino
acid domain, each of which contains two transmembrane regions (Stryer, L. (1995) Biochemistry, W.H.
Freeman and Company, New York NY, p. 551; PROSITE PDOC00189 Mitochondrial energy transfer
proteins signature; Online Mendelian Inheritance in Man (OMIM) *275000 Graves Disease).

This class of transporters also includes the mitochondrial uncoupling proteins, which create
proton leaks across the inner mitochondrial membrane, thus uncoupling oxidative phosphorylation from
ATP synthesis. The result is energy dissipation in the form of heat. Mitochondrial uncoupling proteins
have been implicated as modulators of thermoregulation and metabolic rate, and have been proposed as
potential targets for drugs against metabolic diseases such as obesity (Ricquier, D. et al. (1999) J. Int.
Med. 245:637-642).
Ion Channels 

The electrical potential of a cell is generated and maintained by controlling the movement of 
ions across the plasma membrane. The movement of ions requires ion channels, which form
ion-selective pores within the membrane. There are two basic types of ion channels, ion transporters and
gated ion channels. Ion transporters utilize the energy obtained from ATP hydrolysis to actively
transport an ion against the ion's concentration gradient. Gated ion channels allow passive flow of an
ion down the ion's electrochemical gradient under restricted conditions. Together, these types of ion
channels generate, maintain, and utilize an electrochemical gradient that is used in 1) electrical impulse
conduction down the axon of a nerve cell, 2) transport of molecules into cells against concentration
gradients, 3) initiation of muscle contraction, and 4) endocrine cell secretion.
Ion Transporters 

Ion transporters generate and maintain the resting electrical potential of a cell. Utilizing the 
energy derived from ATP hydrolysis, they transport ions against the ion's concentration gradient. 
These transmembrane ATPases are divided into three families. The phosphorylated (P) class ion
transporters, including Na+-K+ ATPase, Ca2+-ATPase, and H+-ATPase, are activated by a
phosphorylation event. P-class ion transporters are responsible for maintaining resting potential
distributions such that cytosolic concentrations of Na+ and Ca2+ are low and cytosolic concentration of
K+ is high. The vacuolar (V) class of ion transporters includes H+ pumps on intracellular organelles,
such as lysosomes and Golgi. V-class ion transporters are responsible for generating the low pH within
the lumen of these organelles that is required for function. The coupling factor (F) class consists of H+
pumps in the mitochondria. F-class ion transporters utilize a proton gradient to generate ATP from
ADP and inorganic phosphate (Pi).

The P-ATPases are hexamers of a 100 kD subunit with ten transmembrane domains and
several large cytoplasmic regions that may play a role in ion binding (Scarborough, G.A. (1999) Curr.
Opin. Cell Biol. 11:517-522). The V-ATPases are composed of two functional domains: the V1
domain, a peripheral complex responsible for ATP hydrolysis; and the V0 domain, an integral
complex responsible for proton translocation across the membrane. The F-ATPases are structurally
and evolutionarily related to the V-ATPases. The F-ATPase F0 domain contains 12 copies of the c
subunit, a highly hydrophobic protein composed of two transmembrane domains and containing a
single buried carboxyl group in TM2 that is essential for proton transport. The V-ATPase V0 domain
contains three types of homologous c subunits with four or five transmembrane domains and the
essential carboxyl group in TM4 or TM3. Both types of complex also contain a single a subunit that
may be involved in regulating the pH dependence of activity (Forgac, M. (1999) J. Biol. Chem.
274:12951-12954).

The resting potential of the cell is utilized in many processes involving carrier proteins and 
gated ion channels. Carrier proteins utilize the resting potential to transport molecules into and out of 
the cell. Amino acid and glucose transport into many cells is linked to sodium ion co-transport
(symport) so that the movement of Na+ down an electrochemical gradient drives transport of the other
molecule up a concentration gradient. Similarly, cardiac muscle links transfer of Ca2+ out of the cell
with transport of Na+ into the cell (antiport).
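
As a rough illustration of how the Na+ gradient can drive uptake of a second solute against its concentration gradient, the following Python sketch estimates the free energy released when Na+ enters the cell and the largest accumulation ratio an uncharged cotransported solute could reach at equilibrium. The ion concentrations, membrane potential, temperature, and all function names are illustrative assumptions (typical textbook-style values), not values taken from this specification.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def ion_free_energy(c_in_mM, c_out_mM, z, vm_volts, temp_k=310.15):
    """Free energy change (J/mol) for moving one mole of an ion INTO the cell.

    Sum of a chemical term (concentration gradient) and an electrical term
    (membrane potential, inside relative to outside).
    """
    chemical = R * temp_k * math.log(c_in_mM / c_out_mM)
    electrical = z * F * vm_volts
    return chemical + electrical

def max_accumulation_ratio(dg_ion, ions_per_solute=1, temp_k=310.15):
    """Upper bound on [solute]in/[solute]out for an uncharged solute when
    each solute molecule is coupled to `ions_per_solute` ions."""
    return math.exp(-ions_per_solute * dg_ion / (R * temp_k))

# Assumed values: 12 mM Na+ inside, 145 mM outside, -70 mV membrane potential.
dg_na = ion_free_energy(c_in_mM=12.0, c_out_mM=145.0, z=+1, vm_volts=-0.070)
ratio_1to1 = max_accumulation_ratio(dg_na, ions_per_solute=1)
ratio_2to1 = max_accumulation_ratio(dg_na, ions_per_solute=2)

print(f"Free energy of Na+ entry: {dg_na / 1000:.1f} kJ/mol")
print(f"Max solute accumulation, 1 Na+ per solute: {ratio_1to1:.0f}-fold")
print(f"Max solute accumulation, 2 Na+ per solute: {ratio_2to1:.0f}-fold")
```

Under these assumed conditions Na+ entry is energetically favorable (negative free energy), and coupling two Na+ ions per solute squares the attainable accumulation ratio. The same bookkeeping, with the sign conventions adjusted, would apply to an antiporter such as the cardiac Na+/Ca2+ exchange described above.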
Gated Ion Channels 

Gated ion channels control ion flow by regulating the opening and closing of pores. The 
ability to control ion flux through various gating mechanisms allows ion channels to mediate such
diverse signaling and homeostatic functions as neuronal and endocrine signaling, muscle contraction,
fertilization, and regulation of ion and pH balance. Gated ion channels are categorized according to
the manner of regulating the gating function. Mechanically-gated channels open their pores in
response to mechanical stress; voltage-gated channels (e.g., Na+, K+, Ca2+, and Cl- channels) open
their pores in response to changes in membrane potential; and ligand-gated channels (e.g.,
acetylcholine-, serotonin-, and glutamate-gated cation channels, and GABA- and glycine-gated
chloride channels) open their pores in the presence of a specific ion, nucleotide, or neurotransmitter.
The gating properties of a particular ion channel (i.e., its threshold for and duration of opening and
closing) are sometimes modulated by association with auxiliary channel proteins and/or
post-translational modifications, such as phosphorylation.

Mechanically-gated or mechanosensitive ion channels act as transducers for the senses of
touch, hearing, and balance, and also play important roles in cell volume regulation, smooth muscle
contraction, and cardiac rhythm generation. A stretch-inactivated channel (SIC) was recently cloned
from rat kidney. The SIC channel belongs to a group of channels which are activated by pressure or
stress on the cell membrane and conduct both Ca2+ and Na+ (Suzuki, M. et al. (1999) J. Biol. Chem.
274:6330-6335).

The pore-forming subunits of the voltage-gated cation channels form a superfamily of ion 
channel proteins. The characteristic domain of these channel proteins comprises six transmembrane 
domains (S1-S6), a pore-forming region (P) located between S5 and S6, and intracellular amino and 
carboxy termini. In the Na+ and Ca2+ subfamilies, this domain is repeated four times, while in the K+
channel subfamily, each channel is formed from a tetramer of either identical or dissimilar subunits.
The P region contains information specifying the ion selectivity for the channel. In the case of K+
channels, a GYG tripeptide is involved in this selectivity (Ishii, T.M. et al. (1997) Proc. Natl. Acad.
Sci. USA 94:11651-11656).

Voltage-gated Na+ and K+ channels are necessary for the function of electrically excitable cells,
such as nerve and muscle cells. Action potentials, which lead to neurotransmitter release and muscle
contraction, arise from large, transient changes in the permeability of the membrane to Na+ and K+
ions. Depolarization of the membrane beyond the threshold level opens voltage-gated Na+ channels.
Sodium ions flow into the cell, further depolarizing the membrane and opening more voltage-gated Na+
channels, which propagates the depolarization down the length of the cell. Depolarization also opens
voltage-gated potassium channels. Consequently, potassium ions flow outward, which leads to
repolarization of the membrane. Voltage-gated channels utilize charged residues in the fourth
transmembrane segment (S4) to sense voltage change. The open state lasts only about 1 millisecond, at
which time the channel spontaneously converts into an inactive state that cannot be opened irrespective
of the membrane potential. Inactivation is mediated by the channel's N-terminus, which acts as a plug
that closes the pore. The transition from an inactive to a closed state requires a return to resting
potential.
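
The closed, open, and inactive states described above can be sketched as a small state machine. The Python below is only a toy model under stated assumptions: the threshold and resting potentials (-55 mV and -70 mV), the time step, and the names `State`, `step`, and `simulate` are hypothetical choices for illustration. Only the roughly 1 millisecond open time and the three transitions come from the text.

```python
from enum import Enum

class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    INACTIVE = "inactive"

THRESHOLD_MV = -55.0     # assumed depolarization threshold
RESTING_MV = -70.0       # assumed resting potential
OPEN_DURATION_MS = 1.0   # open state lasts about 1 ms (from the text)

def step(state, vm_mv, ms_in_state):
    """Return the next channel state for one time step."""
    if state is State.CLOSED and vm_mv >= THRESHOLD_MV:
        return State.OPEN        # depolarization beyond threshold opens the pore
    if state is State.OPEN and ms_in_state >= OPEN_DURATION_MS:
        return State.INACTIVE    # spontaneous inactivation (N-terminal plug)
    if state is State.INACTIVE and vm_mv <= RESTING_MV:
        return State.CLOSED      # recovery requires a return to resting potential
    return state

def simulate(trace_mv, dt_ms=0.5):
    """Run the state machine over a membrane-potential trace sampled every dt_ms."""
    state, ms_in_state, history = State.CLOSED, 0.0, []
    for vm_mv in trace_mv:
        new_state = step(state, vm_mv, ms_in_state)
        # Restart the dwell timer on a transition; one time step then elapses
        # before the next sample is evaluated.
        ms_in_state = dt_ms if new_state is not state else ms_in_state + dt_ms
        state = new_state
        history.append(state)
    return history

trace = [-70, -50, -50, -50, -50, -50, -70, -70]
history = simulate(trace)
print([s.value for s in history])
```

In this sketch an inactive channel ignores continued depolarization and only becomes available again once the potential has returned to the assumed resting value, which mirrors the refractory behavior described in the paragraph above.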

Voltage-gated Na+ channels are heterotrimeric complexes composed of a 260 kDa pore-forming
α subunit that associates with two smaller auxiliary subunits, β1 and β2. The β2 subunit is an integral
membrane glycoprotein that contains an extracellular Ig domain, and its association with α and β1
subunits correlates with increased functional expression of the channel, a change in its gating
properties, as well as an increase in whole cell capacitance due to an increase in membrane surface area
(Isom, L.L. et al. (1995) Cell 83:433-442).

Non-voltage-gated Na+ channels include the members of the amiloride-sensitive Na+
channel/degenerin (NaC/DEG) family. Channel subunits of this family are thought to consist of two
transmembrane domains flanking a long extracellular loop, with the amino and carboxyl termini located
within the cell. The NaC/DEG family includes the epithelial Na+ channel (ENaC) involved in Na+
reabsorption in epithelia including the airway, distal colon, cortical collecting duct of the kidney, and
exocrine duct glands. Mutations in ENaC result in pseudohypoaldosteronism type 1 and Liddle's
syndrome (pseudohyperaldosteronism). The NaC/DEG family also includes the recently characterized
H+-gated cation channels or acid-sensing ion channels (ASIC). ASIC subunits are expressed in the
brain and form heteromultimeric Na+-permeable channels. These channels require acid pH fluctuations
for activation. ASIC subunits show homology to the degenerins, a family of mechanically-gated
channels originally isolated from C. elegans. Mutations in the degenerins cause neurodegeneration.
ASIC subunits may also have a role in neuronal function, or in pain perception, since tissue acidosis
causes pain (Waldmann, R. and M. Lazdunski (1998) Curr. Opin. Neurobiol. 8:418-424; Eglen, R.M.
et al. (1999) Trends Pharmacol. Sci. 20:337-342).

K+ channels are located in all cell types, and may be regulated by voltage, ATP concentration,
or second messengers such as Ca2+ and cAMP. In non-excitable tissue, K+ channels are involved in
protein synthesis, control of endocrine secretions, and the maintenance of osmotic equilibrium across
membranes. In neurons and other excitable cells, in addition to regulating action potentials and
repolarizing membranes, K+ channels are responsible for setting resting membrane potential. The
cytosol contains non-diffusible anions and, to balance this net negative charge, the cell contains a
Na+-K+ pump and ion channels that provide the redistribution of Na+, K+, and Cl-. The pump actively
transports Na+ out of the cell and K+ into the cell in a 3:2 ratio. Ion channels in the plasma membrane
allow K+ and Cl- to flow by passive diffusion. Because of the high negative charge within the cytosol,
Cl- flows out of the cell. The flow of K+ is balanced by an electromotive force pulling K+ into the cell,
and a K+ concentration gradient pushing K+ out of the cell. Thus, the resting membrane potential is
primarily regulated by K+ flow (Salkoff, L. and T. Jegla (1995) Neuron 15:489-492).
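
The balance between the electrical force and the concentration gradient for a single ion is commonly summarized by the Nernst equation, E = (RT/zF) ln([ion]out/[ion]in). The Python sketch below evaluates it for K+, Na+, and Cl-. The concentrations and temperature are assumed, textbook-style values chosen only for illustration, and `nernst_mv` and `ASSUMED_IONS` are hypothetical names; none of these numbers come from this specification.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def nernst_mv(z, c_out_mM, c_in_mM, temp_k=310.15):
    """Equilibrium (Nernst) potential in millivolts, inside relative to outside."""
    return 1000.0 * (R * temp_k) / (z * F) * math.log(c_out_mM / c_in_mM)

# Assumed illustrative values: (charge, outside mM, inside mM).
ASSUMED_IONS = {
    "K+":  (+1, 5.0, 140.0),
    "Na+": (+1, 145.0, 12.0),
    "Cl-": (-1, 110.0, 10.0),
}

potentials = {
    ion: nernst_mv(z, c_out, c_in)
    for ion, (z, c_out, c_in) in ASSUMED_IONS.items()
}

for ion, e_mv in potentials.items():
    print(f"E({ion}) = {e_mv:.1f} mV")
```

With these assumed concentrations the K+ equilibrium potential is strongly negative while that of Na+ is positive, which is consistent with the statement above that a membrane mainly permeable to K+ at rest sits at a negative resting potential.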

Potassium channel subunits of the Shaker-like superfamily all have the characteristic six
transmembrane/1 pore domain structure. Four subunits combine as homo- or heterotetramers to form
functional K+ channels. These pore-forming subunits also associate with various cytoplasmic β
subunits that alter channel inactivation kinetics. The Shaker-like channel family includes the
voltage-gated K+ channels as well as the delayed rectifier type channels such as the human ether-a-go-go
related gene (HERG) associated with long QT, a cardiac dysrhythmia syndrome (Curran, M.E. (1998)
Curr. Opin. Biotechnol. 9:565-572; Kaczorowski, G.J. and M.L. Garcia (1999) Curr. Opin. Chem.
Biol. 3:448-458).

A second superfamily of K+ channels is composed of the inward rectifying channels (Kir).
Kir channels have the property of preferentially conducting K+ currents in the inward direction. These
proteins consist of a single potassium selective pore domain and two transmembrane domains, which
correspond to the fifth and sixth transmembrane domains of voltage-gated K+ channels. Kir subunits
also associate as tetramers. The Kir family includes ROMK1, mutations in which lead to Bartter
syndrome, a renal tubular disorder. Kir channels are also involved in regulation of cardiac pacemaker
activity, seizures and epilepsy, and insulin regulation (Doupnik, C.A. et al. (1995) Curr. Opin.
Neurobiol. 5:268-277; Curran, supra).

The recently recognized TWIK K+ channel family includes the mammalian TWIK-1, TREK-1,
and TASK proteins. Members of this family possess an overall structure with four transmembrane
domains and two P domains. These proteins are probably involved in controlling the resting potential
in a large set of cell types (Duprat, F. et al. (1997) EMBO J. 16:5464-5471).

The voltage-gated Ca2+ channels have been classified into several subtypes based upon their
electrophysiological and pharmacological characteristics. L-type Ca2+ channels are predominantly
expressed in heart and skeletal muscle where they play an essential role in excitation-contraction
coupling. T-type channels are important for cardiac pacemaker activity, while N-type and P/Q-type
channels are involved in the control of neurotransmitter release in the central and peripheral nervous
system. The L-type and N-type voltage-gated Ca2+ channels have been purified and, though their
functions differ dramatically, they have similar subunit compositions. The channels are composed of
three subunits. The α1 subunit forms the membrane pore and voltage sensor, while the α2δ and β
subunits modulate the voltage-dependence, gating properties, and the current amplitude of the channel.
These subunits are encoded by at least six α1, one α2δ, and four β genes. A fourth subunit, γ, has been
identified in skeletal muscle (Walker, D. et al. (1998) J. Biol. Chem. 273:2361-2367; McCleskey, E.W.
(1994) Curr. Opin. Neurobiol. 4:304-312).

Chloride channels are necessary in endocrine secretion and in regulation of cytosolic and
organelle pH. In secretory epithelial cells, Cl- enters the cell across a basolateral membrane through an
Na+,K+/Cl- cotransporter, accumulating in the cell above its electrochemical equilibrium concentration.
Secretion of Cl- from the apical surface, in response to hormonal stimulation, leads to flow of Na+ and
water into the secretory lumen. The cystic fibrosis transmembrane conductance regulator (CFTR) is a
chloride channel encoded by the gene for cystic fibrosis, a common fatal genetic disorder in humans.
CFTR is a member of the ABC transporter family, and is composed of two domains each consisting of
six transmembrane domains followed by a nucleotide-binding site. Loss of CFTR function decreases
transepithelial water secretion and, as a result, the layers of mucus that coat the respiratory tree,
pancreatic ducts, and intestine are dehydrated and difficult to clear. The resulting blockage of these
sites leads to pancreatic insufficiency, "meconium ileus", and devastating "chronic obstructive
pulmonary disease" (Al-Awqati, Q. et al. (1992) J. Exp. Biol. 172:245-266).

The voltage-gated chloride channels (CLC) are characterized by 10-12 transmembrane 
domains, as well as two small globular domains known as CBS domains. The CLC subunits probably 
function as homotetramers. CLC proteins are involved in regulation of cell volume, membrane 
potential stabilization, signal transduction, and transepithelial transport. Mutations in CLC-1,
expressed predominantly in skeletal muscle, are responsible for autosomal recessive generalized
myotonia and autosomal dominant myotonia congenita, while mutations in the kidney channel CLC-5
lead to kidney stones (Jentsch, T.J. (1996) Curr. Opin. Neurobiol. 6:303-310).

Ligand-gated channels open their pores when an extracellular or intracellular mediator binds to 
the channel. Neurotransmitter-gated channels are channels that open when a neurotransmitter binds to
their extracellular domain. These channels exist in the postsynaptic membrane of nerve or muscle cells.
There are two types of neurotransmitter-gated channels. Sodium channels open in response to
excitatory neurotransmitters, such as acetylcholine, glutamate, and serotonin. This opening causes an
influx of Na+ and produces the initial localized depolarization that activates the voltage-gated channels
and starts the action potential. Chloride channels open in response to inhibitory neurotransmitters, such
as γ-aminobutyric acid (GABA) and glycine, leading to hyperpolarization of the membrane, which
inhibits the subsequent generation of an action potential. Neurotransmitter-gated ion channels have four
transmembrane domains and probably function as pentamers (Jentsch, supra). Amino acids in the
second transmembrane domain appear to be important in determining channel permeation and
selectivity (Sather, W.A. et al. (1994) Curr. Opin. Neurobiol. 4:313-323).

Ligand-gated channels can be regulated by intracellular second messengers. For example,
calcium-activated K+ channels are gated by internal calcium ions. In nerve cells, an influx of calcium
during depolarization opens K+ channels to modulate the magnitude of the action potential (Ishii et al.,
supra). The large conductance (BK) channel has been purified from brain and its subunit composition
determined. The α subunit of the BK channel has seven rather than six transmembrane domains in
contrast to voltage-gated K+ channels. The extra transmembrane domain is located at the subunit
N-terminus. A 28-amino-acid stretch in the C-terminal region of the subunit (the "calcium bowl" region)
contains many negatively charged residues and is thought to be the region responsible for calcium
binding. The β subunit consists of two transmembrane domains connected by a glycosylated
extracellular loop, with intracellular N- and C-termini (Kaczorowski, supra; Vergara, C. et al. (1998)
Curr. Opin. Neurobiol. 8:321-329).

Cyclic nucleotide-gated (CNG) channels are gated by cytosolic cyclic nucleotides. The best 
examples of these are the cAMP-gated Na+ channels involved in olfaction and the cGMP-gated cation
channels involved in vision. Both systems involve ligand-mediated activation of a G-protein coupled
receptor which then alters the level of cyclic nucleotide within the cell. CNG channels also represent a
major pathway for Ca2+ entry into neurons, and play roles in neuronal development and plasticity.
CNG channels are tetramers containing at least two types of subunits, an α subunit which can form
functional homomeric channels, and a β subunit, which modulates the channel properties. All CNG
subunits have six transmembrane domains and a pore forming region between the fifth and sixth
transmembrane domains, similar to voltage-gated K+ channels. A large C-terminal domain contains a
cyclic nucleotide binding domain, while the N-terminal domain confers variation among channel
subtypes (Zufall, F. et al. (1997) Curr. Opin. Neurobiol. 7:404-412).

The activity of other types of ion channel proteins may also be modulated by a variety of 
intracellular signalling proteins. Many channels have sites for phosphorylation by one or more protein
kinases including protein kinase A, protein kinase C, tyrosine kinase, and casein kinase II, all of
which regulate ion channel activity in cells. Kir channels are activated by the binding of the Gβγ
subunits of heterotrimeric G-proteins (Reimann, F. and F.M. Ashcroft (1999) Curr. Opin. Cell Biol.
11:503-508). Other proteins are involved in the localization of ion channels to specific sites in the cell
membrane. Such proteins include the PDZ domain proteins known as MAGUKs (membrane-associated
guanylate kinases) which regulate the clustering of ion channels at neuronal synapses (Craven, S.E. and
D.S. Bredt (1998) Cell 93:495-498).
Disease Correlation 

The etiology of numerous human diseases and disorders can be attributed to defects in the 
transport of molecules across membranes. Defects in the trafficking of membrane-bound transporters
and ion channels are associated with several disorders, e.g., cystic fibrosis, glucose-galactose
malabsorption syndrome, hypercholesterolemia, von Gierke disease, and certain forms of diabetes
mellitus. Single-gene defect diseases resulting in an inability to transport small molecules across
membranes include, e.g., cystinuria, iminoglycinuria, Hartnup disease, and Fanconi disease (van't Hoff,
W.G. (1996) Exp. Nephrol. 4:253-262; Talente, G.M. et al. (1994) Ann. Intern. Med. 120:218-226;
and Chillon, M. et al. (1995) New Engl. J. Med. 332:1475-1480).

Human diseases caused by mutations in ion channel genes include disorders of skeletal muscle,
cardiac muscle, and the central nervous system. Mutations in the pore-forming subunits of sodium and
chloride channels cause myotonia, a muscle disorder in which relaxation after voluntary contraction is
delayed. Sodium channel myotonias have been treated with channel blockers. Mutations in muscle
sodium and calcium channels cause forms of periodic paralysis, while mutations in the sarcoplasmic
calcium release channel, T-tubule calcium channel, and muscle sodium channel cause malignant
hyperthermia. Cardiac arrhythmia disorders such as the long QT syndromes and idiopathic ventricular
fibrillation are caused by mutations in potassium and sodium channels (Cooper, E.C. and L.Y. Jan
(1998) Proc. Natl. Acad. Sci. USA 96:4759-4766). All four known human idiopathic epilepsy genes
code for ion channel proteins (Berkovic, S.F. and I.E. Scheffer (1999) Curr. Opin. Neurology 12:177-182).
Other neurological disorders such as ataxias, hemiplegic migraine and hereditary deafness can
also result from mutations in ion channel genes (Jen, J. (1999) Curr. Opin. Neurobiol. 9:274-280;
Cooper, supra).

Ion channels have been the target for many drug therapies. Neurotransmitter-gated channels
have been targeted in therapies for treatment of insomnia, anxiety, depression, and schizophrenia.
Voltage-gated channels have been targeted in therapies for arrhythmia, ischemic stroke, head trauma,
and neurodegenerative disease (Taylor, C.P. and L.S. Narasimhan (1997) Adv. Pharmacol. 39:47-98).
Various classes of ion channels also play an important role in the perception of pain, and thus are
potential targets for new analgesics. These include the vanilloid-gated ion channels, which are activated
by the vanilloid capsaicin, as well as by noxious heat. Local anesthetics such as lidocaine and
mexiletine, which block voltage-gated Na+ channels, have been useful in the treatment of neuropathic
pain (Eglen, supra).

Ion channels in the immune system have recently been suggested as targets for
immunomodulation. T-cell activation depends upon calcium signaling, and a diverse set of T-cell
specific ion channels has been characterized that affect this signaling process. Channel blocking agents
can inhibit secretion of lymphokines, cell proliferation, and killing of target cells. A peptide antagonist
of the T-cell potassium channel Kv1.3 was found to suppress delayed-type hypersensitivity and
allogenic responses in pigs, validating the idea of channel blockers as safe and efficacious
immunosuppressants (Cahalan, M.D. and K.G. Chandy (1997) Curr. Opin. Biotechnol. 8:749-756).

The discovery of new transporters and ion channels and the polynucleotides encoding them
satisfies a need in the art by providing new compositions which are useful in the diagnosis, prevention,
and treatment of transport, neurological, muscle, and immunological disorders, and in the assessment of
the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of
transporters and ion channels.

SUMMARY OF THE INVENTION 

The invention features purified polypeptides, transporters and ion channels, referred to
collectively as "TRICH" and individually as "TRICH-1," "TRICH-2," "TRICH-3," "TRICH-4,"
"TRICH-5," "TRICH-6," "TRICH-7," "TRICH-8," "TRICH-9," "TRICH-10," "TRICH-11,"
"TRICH-12," "TRICH-13," "TRICH-14," "TRICH-15," "TRICH-16," "TRICH-17," "TRICH-18,"
"TRICH-19," "TRICH-20," "TRICH-21," "TRICH-22," "TRICH-23," "TRICH-24," "TRICH-25,"
"TRICH-26," and "TRICH-27." In one aspect, the invention provides an isolated polypeptide
comprising an amino acid sequence selected from the group consisting of a) an amino acid sequence
selected from the group consisting of SEQ ID NO:1-27, b) a naturally occurring amino acid sequence
having at least 90% sequence identity to an amino acid sequence selected from the group consisting of
SEQ ID NO:1-27, c) a biologically active fragment of an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27, and d) an immunogenic fragment of an amino acid sequence selected
from the group consisting of SEQ ID NO:1-27. In one alternative, the invention provides an isolated
polypeptide comprising the amino acid sequence of SEQ ID NO:1-27.
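By way of illustration only, the "at least 90% sequence identity" criterion recited above can be sketched as a simple calculation. The sketch assumes that the two sequences have already been aligned by an external tool and that gaps are marked with "-"; counting gapped columns as mismatches is one convention among several and is not the method of the specification, whose analysis tools and threshold parameters are listed in Table 7. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned to equal length, with '-' marking gaps.
    Columns containing a gap are counted as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the aligned pair reaches the stated percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment, not taken from SEQ ID NO:1-27.
reference = "MKTLLV-A"
candidate = "MKSLLVGA"
print(percent_identity(reference, candidate))          # 75.0
print(meets_identity_threshold(reference, candidate))  # False
```

In this toy case six of eight columns match, so the candidate falls below the 90% level.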

The invention further provides an isolated polynucleotide encoding a polypeptide comprising an
amino acid sequence selected from the group consisting of a) an amino acid sequence selected from the
group consisting of SEQ ID NO:1-27, b) a naturally occurring amino acid sequence having at least
90% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NO:1-27,
c) a biologically active fragment of an amino acid sequence selected from the group consisting of
SEQ ID NO:1-27, and d) an immunogenic fragment of an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27. In one alternative, the polynucleotide encodes a polypeptide selected
from the group consisting of SEQ ID NO:1-27. In another alternative, the polynucleotide is selected
from the group consisting of SEQ ID NO:28-54.

Additionally, the invention provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide comprising an amino acid
sequence selected from the group consisting of a) an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27, b) a naturally occurring amino acid sequence having at least 90%
sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NO:1-27, c)
a biologically active fragment of an amino acid sequence selected from the group consisting of SEQ ID
NO:1-27, and d) an immunogenic fragment of an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27. In one alternative, the invention provides a cell transformed with the
recombinant polynucleotide. In another alternative, the invention provides a transgenic organism
comprising the recombinant polynucleotide.

The invention also provides a method for producing a polypeptide comprising an amino acid
sequence selected from the group consisting of a) an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27, b) a naturally occurring amino acid sequence having at least 90%
sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NO:1-27, c)
a biologically active fragment of an amino acid sequence selected from the group consisting of SEQ ID
NO:1-27, and d) an immunogenic fragment of an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27. The method comprises a) culturing a cell under conditions suitable for
expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide
comprising a promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b)
recovering the polypeptide so expressed.

Additionally, the invention provides an isolated antibody which specifically binds to a
polypeptide comprising an amino acid sequence selected from the group consisting of a) an amino acid
sequence selected from the group consisting of SEQ ID NO:1-27, b) a naturally occurring amino acid
sequence having at least 90% sequence identity to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27, c) a biologically active fragment of an amino acid sequence selected
from the group consisting of SEQ ID NO:1-27, and d) an immunogenic fragment of an amino acid
sequence selected from the group consisting of SEQ ID NO:1-27.

The invention further provides an isolated polynucleotide comprising a polynucleotide sequence
selected from the group consisting of a) a polynucleotide sequence selected from the group consisting of
SEQ ID NO:28-54, b) a naturally occurring polynucleotide sequence having at least 90% sequence
identity to a polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54, c) a
polynucleotide sequence complementary to a), d) a polynucleotide sequence complementary to b), and e)
an RNA equivalent of a)-d). In one alternative, the polynucleotide comprises at least 60 contiguous
nucleotides.

Additionally, the invention provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide having a sequence of a polynucleotide comprising a polynucleotide sequence
selected from the group consisting of a) a polynucleotide sequence selected from the group consisting of
SEQ ID NO:28-54, b) a naturally occurring polynucleotide sequence having at least 90% sequence
identity to a polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54, c) a
polynucleotide sequence complementary to a), d) a polynucleotide sequence complementary to b), and e)
an RNA equivalent of a)-d). The method comprises a) hybridizing the sample with a probe comprising
at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide
in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex is formed between said probe and said target polynucleotide or
fragments thereof, and b) detecting the presence or absence of said hybridization complex, and
optionally, if present, the amount thereof. In one alternative, the probe comprises at least 60 contiguous
nucleotides.

The invention further provides a method for detecting a target polynucleotide in a sample, said
target polynucleotide having a sequence of a polynucleotide comprising a polynucleotide sequence
selected from the group consisting of a) a polynucleotide sequence selected from the group consisting of
SEQ ID NO:28-54, b) a naturally occurring polynucleotide sequence having at least 90% sequence
identity to a polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54, c) a
polynucleotide sequence complementary to a), d) a polynucleotide sequence complementary to b), and e)
an RNA equivalent of a)-d). The method comprises a) amplifying said target polynucleotide or
fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or
absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the
amount thereof.

The invention further provides a composition comprising an effective amount of a polypeptide
comprising an amino acid sequence selected from the group consisting of a) an amino acid sequence
selected from the group consisting of SEQ ID NO:1-27, b) a naturally occurring amino acid sequence
having at least 90% sequence identity to an amino acid sequence selected from the group consisting of
SEQ ID NO:1-27, c) a biologically active fragment of an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27, and d) an immunogenic fragment of an amino acid sequence selected
from the group consisting of SEQ ID NO:1-27, and a pharmaceutically acceptable excipient. In one
embodiment, the composition comprises an amino acid sequence selected from the group consisting of
SEQ ID NO:1-27. The invention additionally provides a method of treating a disease or condition
associated with decreased expression of functional TRICH, comprising administering to a patient in
need of such treatment the composition.

The invention also provides a method for screening a compound for effectiveness as an
agonist of a polypeptide comprising an amino acid sequence selected from the group consisting of a) an
amino acid sequence selected from the group consisting of SEQ ID NO:1-27, b) a naturally occurring
amino acid sequence having at least 90% sequence identity to an amino acid sequence selected from the
group consisting of SEQ ID NO:1-27, c) a biologically active fragment of an amino acid sequence
selected from the group consisting of SEQ ID NO:1-27, and d) an immunogenic fragment of an amino
acid sequence selected from the group consisting of SEQ ID NO:1-27. The method comprises a)
exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the
sample. In one alternative, the invention provides a composition comprising an agonist compound
identified by the method and a pharmaceutically acceptable excipient. In another alternative, the
invention provides a method of treating a disease or condition associated with decreased expression of
functional TRICH, comprising administering to a patient in need of such treatment the composition.

Additionally, the invention provides a method for screening a compound for effectiveness as
an antagonist of a polypeptide comprising an amino acid sequence selected from the group consisting
of a) an amino acid sequence selected from the group consisting of SEQ ID NO:1-27, b) a naturally
occurring amino acid sequence having at least 90% sequence identity to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-27, c) a biologically active fragment of an amino
acid sequence selected from the group consisting of SEQ ID NO:1-27, and d) an immunogenic fragment
of an amino acid sequence selected from the group consisting of SEQ ID NO:1-27. The method
comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting
antagonist activity in the sample. In one alternative, the invention provides a composition comprising
an antagonist compound identified by the method and a pharmaceutically acceptable excipient. In
another alternative, the invention provides a method of treating a disease or condition associated with
overexpression of functional TRICH, comprising administering to a patient in need of such treatment
the composition.

The invention further provides a method of screening for a compound that specifically binds 
to a polypeptide comprising an amino acid sequence selected from the group consisting of a) an amino 
acid sequence selected from the group consisting of SEQ ID NO: 1-27, b) a naturally occurring amino 
acid sequence having at least 90% sequence identity to an amino acid sequence selected from the group 
consisting of SEQ ID NO: 1-27, c) a biologically active fragment of an amino acid sequence selected 
from the group consisting of SEQ ID NO: 1-27, and d) an immunogenic fragment of an amino acid 
sequence selected from the group consisting of SEQ ID NO: 1-27. The method comprises a) combining 
the polypeptide with at least one test compound under suitable conditions, and b) detecting binding 
of the polypeptide to the test compound, thereby identifying a compound that specifically binds to the 
polypeptide. 

The invention further provides a method of screening for a compound that modulates the 
activity of a polypeptide comprising an amino acid sequence selected from the group consisting of a) an 
amino acid sequence selected from the group consisting of SEQ ID NO: 1-27, b) a naturally occurring 
amino acid sequence having at least 90% sequence identity to an amino acid sequence selected from the 
group consisting of SEQ ID NO: 1-27, c) a biologically active fragment of an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-27, and d) an immunogenic fragment of an amino 
acid sequence selected from the group consisting of SEQ ID NO: 1-27. The method comprises a) 
combining the polypeptide with at least one test compound under conditions permissive for the 
activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test 
compound, and c) comparing the activity of the polypeptide in the presence of the test compound with 
the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of 
the polypeptide in the presence of the test compound is indicative of a compound that modulates the 
activity of the polypeptide. 
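The comparison recited in steps b) and c) above may be sketched, purely as an illustration, as follows. The function name, the use of a relative change, and the 10% tolerance are assumptions introduced here to separate a real change from measurement noise; the specification itself states only that a change in activity is indicative of a modulating compound and fixes no numerical cut-off.

```python
def classify_modulation(activity_with: float, activity_without: float,
                        tolerance: float = 0.1) -> str:
    """Compare polypeptide activity measured with and without a test compound.

    Returns 'increased', 'decreased', or 'unchanged'. The relative tolerance
    is a hypothetical noise margin, not a value taken from the specification.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    relative_change = (activity_with - activity_without) / activity_without
    if relative_change > tolerance:
        return "increased"
    if relative_change < -tolerance:
        return "decreased"
    return "unchanged"


# Hypothetical activity readings in arbitrary units.
print(classify_modulation(12.0, 8.0))  # increased
print(classify_modulation(4.0, 8.0))   # decreased
print(classify_modulation(8.2, 8.0))   # unchanged
```

Either an "increased" or a "decreased" result would flag the test compound as a candidate modulator under these assumptions.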
The invention further provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
sequence selected from the group consisting of SEQ ID NO:28-54, the method comprising a)
exposing a sample comprising the target polynucleotide to a compound, and b) detecting altered
expression of the target polynucleotide.

The invention further provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound;
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide comprising a polynucleotide sequence selected from the
group consisting of i) a polynucleotide sequence selected from the group consisting of SEQ ID
NO:28-54, ii) a naturally occurring polynucleotide sequence having at least 90% sequence identity to
a polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54, iii) a
polynucleotide sequence complementary to i), iv) a polynucleotide sequence complementary to ii),
and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions whereby a specific
hybridization complex is formed between said probe and a target polynucleotide in the biological
sample, said target polynucleotide comprising a polynucleotide sequence selected from the group
consisting of i) a polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54,
ii) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54, iii) a
polynucleotide sequence complementary to i), iv) a polynucleotide sequence complementary to ii),
and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide comprises a fragment of
a polynucleotide sequence selected from the group consisting of i)-v) above; c) quantifying the
amount of hybridization complex; and d) comparing the amount of hybridization complex in the
treated biological sample with the amount of hybridization complex in an untreated biological
sample, wherein a difference in the amount of hybridization complex in the treated biological sample
is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES 

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the present invention.

Table 2 shows the GenBank identification number and annotation of the nearest GenBank 
homolog for each polypeptide of the invention. The probability score for the match between each 
polypeptide and its GenBank homolog is also shown. 

Table 3 shows structural features of each polypeptide sequence, including predicted motifs and
domains, along with the methods, algorithms, and searchable databases used for analysis of each
polypeptide.

WO 01/46258    PCT/US00/35095

Table 4 lists the cDNA and genomic DNA fragments which were used to assemble each 
polynucleotide sequence, along with selected fragments of the polynucleotide sequences. 

Table 5 shows the representative cDNA library for each polynucleotide of the invention.

Table 6 provides an appendix which describes the tissues and vectors used for construction of
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and 
polypeptides of the invention, along with applicable descriptions, references, and threshold parameters. 

DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleotide sequences, and methods are described, it is understood 
that this invention is not limited to the particular machines, materials and methods described, as these 
may vary. It is also to be understood that the terminology used herein is for the purpose of describing 
particular embodiments only, and is not intended to limit the scope of the present invention which will 
be limited only by the appended claims. 

It must be noted that as used herein and in the appended claims, the singular forms "a," "an," 
and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a 
reference to "a host cell" includes a plurality of such host cells, and a reference to "an antibody" is a 
reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so 
forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same meanings 
as commonly understood by one of ordinary skill in the art to which this invention belongs. Although 
any machines, materials, and methods similar or equivalent to those described herein can be used to 
practice or test the present invention, the preferred machines, materials and methods are now described. 
All publications mentioned herein are cited for the purpose of describing and disclosing the cell lines, 
protocols, reagents and vectors which are reported in the publications and which might be used in 
connection with the invention. Nothing herein is to be construed as an admission that the invention is 
not entitled to antedate such disclosure by virtue of prior invention. 
DEFINITIONS 

"TRICH" refers to the amino acid sequences of substantially purified TRICH obtained from 
any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and 
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant. 

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of 
TRICH. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other
compound or composition which modulates the activity of TRICH either by directly interacting with
TRICH or by acting on components of the biological pathway in which TRICH participates. 

An "allelic variant" is an alternative form of the gene encoding TRICH. Allelic variants may 
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to 
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. 
Each of these types of changes may occur alone, or in combination with the others, one or more times in 
a given sequence. 

"Altered" nucleic acid sequences encoding TRICH include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as TRICH or a
polypeptide with at least one functional characteristic of TRICH. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of
the polynucleotide encoding TRICH, and improper or unexpected hybridization to allelic variants, with
a locus other than the normal chromosomal locus for the polynucleotide sequence encoding TRICH.
The encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of
amino acid residues which produce a silent change and result in a functionally equivalent TRICH.
Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge,
solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the
biological or immunological activity of TRICH is retained. For example, negatively charged amino
acids may include aspartic acid and glutamic acid, and positively charged amino acids may include
lysine and arginine. Amino acids with uncharged polar side chains having similar hydrophilicity
values may include: asparagine and glutamine; and serine and threonine. Amino acids with
uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and
valine; glycine and alanine; and phenylalanine and tyrosine.

The terms "amino acid" and "amino acid sequence" refer to an oligopeptide, peptide, 
polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic 
molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally occurring 
protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid sequence
to the complete native amino acid sequence associated with the recited protein molecule.

"Amplification" relates to the production of additional copies of a nucleic acid sequence. 
Amplification is generally carried out using polymerase chain reaction (PCR) technologies well known 
in the art. 

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity of
TRICH. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small
molecules, or any other compound or composition which modulates the activity of TRICH either by 
directly interacting with TRICH or by acting on components of the biological pathway in which TRICH 
participates. 

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments thereof,
such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind TRICH polypeptides can be prepared using intact polypeptides or using fragments
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used
to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or
synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers
that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole
limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that 
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies which
bind specifically to antigenic determinants (particular regions or three-dimensional structures on the 
protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used to 
elicit the immune response) for binding to an antibody. 

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; RNA;
peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine. Antisense
molecules may be produced by any method including chemical synthesis or transcription. Once
introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring 
nucleic acid sequence produced by the cell to form duplexes which block either transcription or 
translation. The designation "negative" or "minus" can refer to the antisense strand, and the 
designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule. 

The term "biologically active" refers to a protein having structural, regulatory, or biochemical
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic"
refers to the capability of the natural, recombinant, or synthetic TRICH, or of any oligopeptide thereof, 
to induce a specific immune response in appropriate animals or cells and to bind with specific 
antibodies. 
"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
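The base-pairing relationship in the example above can be sketched in a few lines. The helper names below are hypothetical, and the sketch assumes standard Watson-Crick pairing of unmodified DNA bases only.

```python
# Watson-Crick pairing for DNA bases (assumption: unmodified A, C, G, T only).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Return the complementary strand written 3'->5', aligned base for base."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the same complementary strand rewritten in 5'->3' order."""
    return complement_3_to_5(strand_5_to_3)[::-1]


print(complement_3_to_5("AGT"))   # TCA, i.e. 3'-TCA-5'
print(reverse_complement("AGT"))  # ACT, i.e. 5'-ACT-3'
```

The two outputs describe the same molecule; only the direction in which the strand is written differs.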

A "composition comprising a given polynucleotide sequence" and a "composition comprising a
given amino acid sequence" refer broadly to any composition containing the given polynucleotide or
amino acid sequence. The composition may comprise a dry formulation or an aqueous solution.
Compositions comprising polynucleotide sequences encoding TRICH or fragments of TRICH may be
employed as hybridization probes. The probes may be stored in freeze-dried form and may be
associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be
deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl sulfate;
SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to repeated 
DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (PE Biosystems,
Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been assembled from
one or more overlapping cDNA, EST, or genomic DNA fragments using a computer program for
fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison WI) or Phrap
(University of Washington, Seattle WA). Some sequences have been both extended and assembled to
produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of the
protein is conserved and not significantly changed by such substitutions. The table below shows amino
acids which may be substituted for an original amino acid in a protein and which are regarded as
conservative amino acid substitutions.



Original Residue        Conservative Substitution
Ala                     Gly, Ser
Arg                     His, Lys
Asn                     Asp, Gln, His
Asp                     Asn, Glu
Cys                     Ala, Ser
Gln                     Asn, Glu, His
Glu                     Asp, Gln, His
Gly                     Ala
His                     Asn, Arg, Gln, Glu
Ile                     Leu, Val
Leu                     Ile, Val
Lys                     Arg, Gln, Glu
Met                     Leu, Ile
Phe                     His, Met, Leu, Trp, Tyr
Ser                     Cys, Thr
Thr                     Ser, Val
Trp                     Phe, Tyr
Tyr                     His, Phe, Trp
Val                     Ile, Leu, Thr



Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide 
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, 
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the 
side chain. 
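The table of conservative substitutions above lends itself to a simple lookup, sketched here for illustration. The dictionary transcribes the table as given; the function name is hypothetical. Note that the table is directional: a substitution listed for one residue is not necessarily listed in reverse.

```python
# Conservative substitutions transcribed from the table above
# (original residue -> residues regarded as conservative replacements).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True when the table lists `replacement` as conservative for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Cys", "Ala"))  # True
print(is_conservative("Ala", "Cys"))  # False: the table is not symmetric
print(is_conservative("Pro", "Ala"))  # False: Pro has no entry in the table
```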



A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the
absence of one or more amino acid residues or nucleotides.

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. Chemical 
modifications of a polynucleotide can include, for example, replacement of hydrogen by an alkyl, acyl, 
hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one
biological or immunological function of the natural molecule. A derivative polypeptide is one modified
by glycosylation, pegylation, or any similar process that retains at least one biological or immunological 
function of the polypeptide from which it was derived. 

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a 
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide. 

A "fragment" is a unique portion of TRICH or the polynucleotide encoding TRICH which is
identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment
used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10,
15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid
residues in length. Fragments may be preferentially selected from certain regions of a molecule. For
example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected
from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain
defined sequence. Clearly these lengths are exemplary, and any length that is supported by the
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present
embodiments.

A fragment of SEQ ID NO:28-54 comprises a region of unique polynucleotide sequence that 
specifically identifies SEQ ID NO:28-54, for example, as distinct from any other sequence in the 
genome from which the fragment was obtained. A fragment of SEQ ID NO:28-54 is useful, for
example, in hybridization and amplification technologies and in analogous methods that distinguish
SEQ ID NO:28-54 from related polynucleotide sequences. The precise length of a fragment of SEQ
ID NO:28-54 and the region of SEQ ID NO:28-54 to which the fragment corresponds are routinely
determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-27 is encoded by a fragment of SEQ ID NO:28-54. A fragment
of SEQ ID NO:1-27 comprises a region of unique amino acid sequence that specifically identifies
SEQ ID NO:1-27. For example, a fragment of SEQ ID NO:1-27 is useful as an immunogenic peptide
for the development of antibodies that specifically recognize SEQ ID NO:1-27. The precise length of
a fragment of SEQ ID NO:1-27 and the region of SEQ ID NO:1-27 to which the fragment
corresponds are routinely determinable by one of ordinary skill in the art based on the intended
purpose for the fragment.

A "full length" polynucleotide sequence is one containing at least a translation initiation codon
(e.g., methionine) followed by an open reading frame and a translation termination codon. A "full
length" polynucleotide sequence encodes a "full length" polypeptide sequence.
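
By way of illustration only, the definition above may be expressed as a simple check. The following Python sketch assumes the standard genetic code (an ATG initiation codon and TAA, TAG, or TGA termination codons) and examines only the first ATG in the sequence; the function name is hypothetical, and the sketch is a simplification rather than a gene-finding method.

```python
# Termination codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_full_length(dna: str) -> bool:
    """Return True if the first ATG is followed, in frame, by a stop codon."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return False  # no translation initiation codon
    # Walk codon by codon through the reading frame set by the first ATG.
    for i in range(start + 3, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return True  # open reading frame closed by a termination codon
    return False
```

A sequence such as GGATGAAATTTTAGCC satisfies the check (ATG, two sense codons, then TAG), whereas ATGAAATTT does not because no in-frame termination codon is present.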

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between two 
or more polynucleotide sequences or two or more polypeptide sequences. 

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer to
the percentage of residue matches between at least two polynucleotide sequences aligned using a
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in
the sequences being compared in order to optimize alignment between two sequences, and therefore
achieve a more meaningful comparison of the two sequences.
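
By way of illustration only, percent identity may be computed from two sequences that have already been aligned, with gaps written as hyphens. The following Python sketch is not the CLUSTAL V or BLAST calculation; the function name is hypothetical, and the choice of denominator (all alignment columns, including gapped columns) is an assumption, since alignment programs differ on this point.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

For the alignment ACGT-A over ACCTGA, four of six columns match, giving about 66.7% identity under this convention.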

Percent identity between polynucleotide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence
alignment program. This program is part of the LASERGENE software package, a suite of molecular
biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in Higgins, D.G.
and P.M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS 8:189-191.
For pairwise alignments of polynucleotide sequences, the default parameters are set as follows:
Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue weight table is
selected as the default. Percent identity is reported by CLUSTAL V as the "percent similarity" between
aligned polynucleotide sequences.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms is
provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search
Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several
sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis
programs including "blastn," that is used to align a known polynucleotide sequence with other
polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The
"BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on
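
By way of illustration only, the match, mismatch, and gap values listed above may be applied to an existing gapped alignment to obtain a raw score. The following Python sketch is not the BLAST program; the function name is hypothetical, and it assumes an affine convention in which a gap of length k costs the open-gap penalty plus k times the extension penalty.

```python
def raw_alignment_score(aligned_a: str, aligned_b: str,
                        match: int = 1, mismatch: int = -2,
                        gap_open: int = 5, gap_extend: int = 2) -> int:
    """Score an existing nucleotide alignment; gaps are written as '-'."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    in_gap = False
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            if not in_gap:
                score -= gap_open   # charged once per run of gap columns
                in_gap = True
            score -= gap_extend     # charged for every gapped column
        else:
            in_gap = False
            score += match if a == b else mismatch
    return score
```

The sketch treats any uninterrupted run of gapped columns as a single gap, which is a simplification. The remaining parameters in the list (Gap x drop-off, Expect, Word Size, and Filter) govern the database search and are not modeled here.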

Percent identity may be measured over the length of an entire defined sequence, for example, as 
defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over 
the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at 
least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such 
lengths are exemplary only, and it is understood that any fragment length supported by the sequences
shown herein, in the tables, figures, or Sequence Listing, may be used to describe a length over which 
percentage identity may be measured. 

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode 
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in 
a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences
that all encode substantially the same protein. 
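
By way of illustration only, the effect of codon degeneracy can be shown with two short coding sequences. The following Python sketch uses a partial codon table drawn from the standard genetic code; the two sequences and all names are hypothetical examples and are not taken from the Sequence Listing.

```python
# Partial codon table (standard genetic code), limited to the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "TTG": "L",
    "TCT": "S", "AGC": "S",
    "CGT": "R", "AGA": "R",
    "TAA": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a termination codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that use different synonymous codons.
seq_a = "ATGCTTTCTCGTTAA"
seq_b = "ATGTTGAGCAGATGA"

# Count positions at which the two nucleotide sequences agree.
matching_positions = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
```

Both sequences translate to the same peptide (Met-Leu-Ser-Arg followed by a termination codon), although they agree at only 7 of 15 nucleotide positions.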

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to 
the percentage of residue matches between at least two polypeptide sequences aligned using a 
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment 
methods take into account conservative amino acid substitutions. Such conservative substitutions,
explained in more detail above, generally preserve the charge and hydrophobicity at the site of
substitution, thus preserving the structure (and therefore function) of the polypeptide. 

Percent identity between polypeptide sequences may be determined using the default parameters
of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment
program (described and referenced above). For pairwise alignments of polypeptide sequences using
CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap penalty=3, window=5, and
"diagonals saved"=5. The PAM250 matrix is selected as the default residue weight table. As with
polynucleotide alignments, the percent identity is reported by CLUSTAL V as the "percent similarity"
between aligned polypeptide sequence pairs.

Alternatively, the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.12
(April-21-2000) with blastp set at default parameters. Such default parameters may be, for example: 

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence, for
example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance,
a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150
contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to
describe a length over which percentage identity may be measured.

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain 
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for 
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a 
complementary strand through base pairing under defined hybridization conditions. Specific
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity.
Specific hybridization complexes form under permissive annealing conditions and remain hybridized
after the "washing" step(s). The washing step(s) is particularly important in determining the stringency
of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e.,
binding between pairs of nucleic acid strands that are not perfectly matched. Permissive conditions for
annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in the art and
may be consistent among hybridization experiments, whereas wash conditions may be varied among
experiments to achieve the desired stringency, and therefore hybridization specificity. Permissive
annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about 1% (w/v)
SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions
for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989) Molecular
Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY; specifically
see volume 2, chapter 9.
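
By way of illustration only, the relationship between the thermal melting point and the wash temperature may be sketched as follows. The text above refers to Sambrook et al. for the equation and does not reproduce it; the formula used in estimate_tm_celsius below is a commonly cited empirical approximation for long DNA duplexes and is included here as an assumption, as are all function and parameter names.

```python
import math


def estimate_tm_celsius(gc_percent: float, length_nt: int, na_molar: float) -> float:
    """Approximate melting point of a long DNA duplex (assumed empirical formula).

    gc_percent: G+C content in percent; length_nt: duplex length in nucleotides;
    na_molar: monovalent cation concentration in mol/L.
    """
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 600.0 / length_nt


def wash_temperature_range(tm_celsius: float) -> tuple:
    """Wash window of about 5 to 20 degrees C below the melting point, as (low, high)."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)
```

Under these assumptions, a duplex with an estimated melting point of 80°C would be washed at about 60°C to 75°C, with temperatures nearer the melting point corresponding to higher stringency.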

High stringency conditions for hybridization between polynucleotides of the present invention
include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour.
Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC concentration may
be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking
reagents are used to block non-specific hybridization. Such blocking reagents include, for instance,
sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic solvent, such as
formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances,
such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily
apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency
conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is
strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acid
sequences by virtue of the formation of hydrogen bonds between complementary bases. A hybridization
complex may be formed in solution (e.g., C0t or R0t analysis) or formed between one nucleic acid
sequence present in solution and another nucleic acid sequence immobilized on a solid support (e.g.,
paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate to which cells
or their nucleic acids have been fixed).

The words "insertion" and "addition" refer to changes in an amino acid or nucleotide sequence 
resulting in the addition of one or more amino acid residues or nucleotides, respectively. 

"Immune response" can refer to conditions associated with inflammation, trauma, immune 
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression of
various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect cellular
and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of TRICH which is 
capable of eliciting an immune response when introduced into a living organism, for example, a 
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment of
TRICH which is useful in any of the antibody production methods disclosed herein or known in the art. 

The term "microarray" refers to an arrangement of a plurality of polynucleotides, polypeptides, 
or other chemical compounds on a substrate. 

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other 
chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of TRICH. For example, modulation 
may cause an increase or a decrease in protein activity, binding characteristics, or any other biological, 
functional, or immunological properties of TRICH. 

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide, 
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material. 

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a 
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where 
necessary to join two protein coding regions, in the same reading frame. 

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which 
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of 
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs
preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and
may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of a TRICH may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the
art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by
cell type depending on the enzymatic milieu of TRICH.

"Probe" refers to nucleic acid sequences encoding TRICH, their complements, or fragments 
thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are 
isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical
labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are
short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by
complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA
polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid
sequence, e.g., by the polymerase chain reaction (PCR).

Probes and primers as used in the present invention typically comprise at least 15 contiguous 
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also 
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may
be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used. 

Methods for preparing and using probes and primers are described in the references, for
example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel, F.M. et al. (1987) Current Protocols in Molecular
Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis, M. et al. (1990) PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge
MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000
nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection
programs have incorporated additional features for expanded capabilities. For example, the PrimOU
primer selection program (available to the public from the Genome Center at University of Texas South
West Medical Center, Dallas TX) is capable of choosing specific primers from megabase sequences
and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection
program (available to the public from the Whitehead Institute/MIT Center for Genome Research,
Cambridge MA) allows the user to input a "mispriming library," in which sequences to avoid as primer
binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for
microarrays. (The source code for the latter two primer selection programs may also be obtained from
their respective sources and modified to meet the user's specific needs.) The PrimeGen program
(available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge
UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that
hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences.
Hence, this program is useful for identification of both unique and conserved oligonucleotides and
polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the
above selection methods are useful in hybridization technologies, for example, as PCR or sequencing
primers, microarray elements, or specific probes to identify fully or partially complementary
polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to
those described above.

A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence 
that is made by an artificial combination of two or more otherwise separated segments of sequence.
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence.
Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a 
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal. 

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated 
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions (UTRs).
Regulatory elements interact with host or viral proteins which control transcription, translation, or RNA 
stability. 

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid, 
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, 
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art. 

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear 
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the 
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose 
instead of deoxyribose.
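
By way of illustration only, the base substitution described above may be sketched in Python. The function name is hypothetical; the sketch operates on the written sequence alone, and the ribose backbone is implied rather than represented.

```python
def rna_equivalent(dna: str) -> str:
    """Return the same linear sequence with every thymine (T) written as uracil (U)."""
    return dna.upper().replace("T", "U")
```

For example, the DNA sequence ATGCTT has the RNA equivalent AUGCUU.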

The term "sample" is used in its broadest sense. A sample suspected of containing TRICH, 
nucleic acids encoding TRICH, or fragments thereof may comprise a bodily fluid; an extract from a 
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, 
in solution or bound to a substrate; a tissue; a tissue print; etc.

The terms "specific binding" and "specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or
synthetic binding composition. The interaction is dependent upon the presence of a particular structure
of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For
example, if an antibody is specific for epitope "A," the presence of a polypeptide comprising the epitope
A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will
reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are 
removed from their natural environment and are isolated or separated, and are at least 60% free, 
preferably at least 75% free, and most preferably at least 90% free from other components with which
they are naturally associated. 

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides by 
different amino acid residues or nucleotides, respectively. 

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters, 
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" refers to the collective pattern of gene expression by a particular cell type 
or tissue under given conditions at a given time.

"Transformation" describes a process by which exogenous DNA is introduced into a recipient
cell. Transformation may occur under natural or artificial conditions according to various methods well
known in the art, and may rely on any known method for the insertion of foreign nucleic acid sequences
into a prokaryotic or eukaryotic host cell. The method for transformation is selected based on the type
of host cell being transformed and may include, but is not limited to, bacteriophage or viral infection,
electroporation, heat shock, lipofection, and particle bombardment. The term "transformed cells"
includes stably transformed cells in which the inserted DNA is capable of replication either as an
autonomously replicating plasmid or as part of the host chromosome, as well as transiently transformed
cells which express the inserted DNA or RNA for limited periods of time.

A "transgenic organism," as used herein, is any organism, including but not limited to 
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic
acid introduced by way of human intervention, such as by transgenic techniques well known in the 
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor 
of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with 
a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in
vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The
transgenic organisms contemplated in accordance with the present invention include bacteria,
cyanobacteria, fungi, plants and animals. The isolated DNA of the present invention can be
introduced into the host by methods known in the art, for example infection, transfection,
transformation or transconjugation. Techniques for transferring the DNA of the present invention
into such organisms are widely known and provided in references such as Sambrook et al. (1989),
supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having at 
least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of the
nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at least
60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95% or at least 98% or greater
sequence identity over a certain defined length. A variant may be described as, for example, an "allelic"
(as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have significant
identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides
due to alternative splicing of exons during mRNA processing. The corresponding polypeptide may
possess additional functional domains or lack domains that are present in the reference molecule.
Species variants are polynucleotide sequences that vary from one species to another. The resulting
polypeptides will generally have significant amino acid identity relative to each other. A polymorphic
variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given
species. Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in 
which the polynucleotide sequence varies by one nucleotide base. The presence of SNPs may be 
indicative of, for example, a certain population, a disease state, or a propensity for a disease state. 
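
By way of illustration only, single nucleotide differences between a reference sequence and a polymorphic variant of equal length may be located as follows. The Python sketch is not part of the disclosed methods; the function name is hypothetical, positions are reported counting from 1, and insertions or deletions are outside its scope.

```python
from typing import List


def snp_positions(reference: str, variant: str) -> List[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [i + 1
            for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()))
            if r != v]
```

For example, comparing ACGTAC with ACCTAC reports a single difference at position 3.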

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having at
least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the
polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) 
set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least 
60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 98% or greater sequence 
identity over a certain defined length of one of the polypeptides. 


THE INVENTION 

The invention is based on the discovery of new human transporters and ion channels (TRICH), 
the polynucleotides encoding TRICH, and the use of these compositions for the diagnosis, treatment, or 
prevention of transport, neurological, muscle, and immunological disorders.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a 
single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted 
by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte 
polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is 
denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and an 
Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown. 

Table 2 shows sequences with homology to the polypeptides of the invention as identified by 
BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the 
polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte 
polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column 3 
shows the GenBank identification number (Genbank ID NO:) of the nearest GenBank homolog. 
Column 4 shows the probability score for the match between each polypeptide and its GenBank 
homolog. Column 5 shows the annotation of the GenBank homolog along with relevant citations where 
applicable, all of which are expressly incorporated by reference herein. 

Table 3 shows various structural features of each of the polypeptides of the invention. Columns 
1 and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding 
Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. 
Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential 
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the MOTIFS 
program of the GCG sequence analysis software package (Genetics Computer Group, Madison WI). 
Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 7 
shows analytical methods for protein structure/function analysis and in some cases, searchable 
databases to which the analytical methods were applied. 

As shown in Table 4, the full length polynucleotide sequences of the present invention were 
assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any
combination of these two types of sequences. Columns 1 and 2 list the polynucleotide sequence 
identification number (Polynucleotide SEQ ID NO:) and the corresponding Incyte polynucleotide 
consensus sequence number (Incyte Polynucleotide ID) for each polynucleotide of the invention. 
Column 3 shows the length of each polynucleotide sequence in basepairs. Column 4 lists fragments of 
the polynucleotide sequences which are useful, for example, in hybridization or amplification 
technologies that identify SEQ ID NO:28-54 or that distinguish between SEQ ID NO:28-54 and 
related polynucleotide sequences. Column 5 shows identification numbers corresponding to cDNA 
sequences, coding sequences (exons) predicted from genomic DNA, and/or sequence assemblages
comprised of both cDNA and genomic DNA. These sequences were used to assemble the full length
polynucleotide sequences of the invention. Columns 6 and 7 of Table 4 show the nucleotide start (5') 
and stop (3') positions of the cDNA and genomic sequences in column 5 relative to their respective full 
length sequences.
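
The start and stop positions described for columns 6 and 7 of Table 4 lend themselves to a simple coverage check. The following is a minimal sketch, assuming 1-based inclusive coordinates relative to the full length sequence; the function name and the example coordinates are hypothetical and are not taken from Table 4.

```python
from typing import List, Tuple


def uncovered_regions(full_length: int,
                      fragments: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the (start, stop) gaps of the full length sequence that no
    fragment covers. Coordinates are assumed 1-based and inclusive."""
    gaps = []
    next_needed = 1  # first position not yet known to be covered
    for start, stop in sorted(fragments):
        if start > next_needed:
            gaps.append((next_needed, start - 1))
        next_needed = max(next_needed, stop + 1)
    if next_needed <= full_length:
        gaps.append((next_needed, full_length))
    return gaps


# Hypothetical fragment coordinates, for illustration only.
print(uncovered_regions(100, [(1, 40), (30, 80)]))   # [(81, 100)]
print(uncovered_regions(100, [(1, 60), (55, 100)]))  # []
```

An empty result indicates that the listed fragments span the entire full length sequence under the stated coordinate assumption.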

The identification numbers in Column 5 of Table 4 may refer specifically, for example, to
Incyte cDNAs along with their corresponding cDNA libraries. For example, 6813453H1 is the
identification number of an Incyte cDNA sequence, and ADRETUR01 is the cDNA library from which
it is derived. Incyte cDNAs for which cDNA libraries are not indicated were derived from pooled
cDNA libraries (e.g., 70207988V1). Alternatively, the identification numbers in column 5 may refer to
GenBank cDNAs or ESTs (e.g., g1947104) which contributed to the assembly of the full length
polynucleotide sequences. Alternatively, the identification numbers in column 5 may refer to coding
regions predicted by Genscan analysis of genomic DNA. For example, GNN.g6554406_006 is the
identification number of a Genscan-predicted coding sequence, with g6554406 being the GenBank
identification number of the sequence to which Genscan was applied. The Genscan-predicted coding
sequences may have been edited prior to assembly. (See Example IV.) Alternatively, the identification
numbers in column 5 may refer to assemblages of both cDNA and Genscan-predicted exons brought
together by an "exon stitching" algorithm. (See Example V.) Alternatively, the identification numbers
in column 5 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by
an "exon-stretching" algorithm. (See Example V.) In some cases, Incyte cDNA coverage redundant
with the sequence coverage shown in column 5 was obtained to confirm the final consensus
polynucleotide sequence, but the relevant Incyte cDNA identification numbers are not shown.
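
The identifier conventions described above can be illustrated with a small classifier. This is a sketch only: the regular expressions are assumptions inferred from the examples given in the text (6813453H1, 70207988V1, GNN.g6554406_006) and are not a specification of the identifier formats; identifiers for stitched or stretched assemblages are not modeled.

```python
import re
from typing import Tuple

# Patterns inferred from the examples in the text; they are assumptions.
GENSCAN_ID = re.compile(r"^GNN\.(g\d+)_(\d+)$")   # e.g. GNN.g6554406_006
GENBANK_ID = re.compile(r"^g\d+$")                # e.g. g6554406
INCYTE_CDNA_ID = re.compile(r"^\d+[A-Z]\d+$")     # e.g. 6813453H1


def classify_identifier(identifier: str) -> Tuple[str, str]:
    """Return (kind, key). For a Genscan prediction, key is the GenBank
    identifier of the genomic sequence to which Genscan was applied."""
    match = GENSCAN_ID.match(identifier)
    if match:
        return ("genscan", match.group(1))
    if GENBANK_ID.match(identifier):
        return ("genbank", identifier)
    if INCYTE_CDNA_ID.match(identifier):
        return ("incyte_cdna", identifier)
    return ("unknown", identifier)


print(classify_identifier("GNN.g6554406_006"))  # ('genscan', 'g6554406')
print(classify_identifier("6813453H1"))         # ('incyte_cdna', '6813453H1')
```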

Table 5 shows the representative cDNA libraries for those full length polynucleotide sequences 
which were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte 
cDNA library which is most frequently represented by the Incyte cDNA sequences which were used to
assemble and confirm the above polynucleotide sequences. The tissues and vectors which were used to
construct the cDNA libraries shown in Table 5 are described in Table 6. 
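
The selection of a representative library, as described above, amounts to taking the most frequent library among the contributing Incyte cDNA sequences. The sketch below assumes each contributing sequence is annotated with its library name, or with None when it came from a pooled library; LIBX01 is a hypothetical library name used only for illustration.

```python
from collections import Counter
from typing import Iterable, Optional


def representative_library(libraries: Iterable[Optional[str]]) -> Optional[str]:
    """Return the library named most often among the contributing cDNAs.
    Entries of None (no library indicated) are ignored."""
    counts = Counter(lib for lib in libraries if lib)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# ADRETUR01 appears in the text; LIBX01 is a made-up placeholder.
print(representative_library(["ADRETUR01", "LIBX01", "ADRETUR01", None]))
# ADRETUR01
```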

The invention also encompasses TRICH variants. A preferred TRICH variant is one which has 
at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid sequence 
identity to the TRICH amino acid sequence, and which contains at least one functional or structural
characteristic of TRICH.

The invention also encompasses polynucleotides which encode TRICH. In a particular 
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected from 
the group consisting of SEQ ID NO:28-54, which encodes TRICH. The polynucleotide sequences of
SEQ ID NO:28-54, as presented in the Sequence Listing, embrace the equivalent RNA sequences,
wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone
is composed of ribose instead of deoxyribose. 
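
At the level of sequence text, the equivalent RNA sequence is obtained by replacing each thymine with uracil. A minimal sketch follows; the function name is hypothetical and the example sequence is arbitrary rather than taken from the Sequence Listing.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence (T -> U), preserving case."""
    return dna.translate(str.maketrans("Tt", "Uu"))


print(dna_to_rna("ATGGCTTAA"))  # AUGGCUUAA
```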

The invention also encompasses a variant of a polynucleotide sequence encoding TRICH. In 
particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at least 
about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide sequence
encoding TRICH. A particular aspect of the invention encompasses a variant of a polynucleotide 
sequence comprising a sequence selected from the group consisting of SEQ ID NO:28-54 which has at 
least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide 
sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NO:28-54.
Any one of the polynucleotide variants described above can encode an amino acid sequence which
contains at least one functional or structural characteristic of TRICH.
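
The identity thresholds recited above can be illustrated by a simple calculation over a pair of sequences that have already been aligned. This sketch is an assumption-laden simplification: it counts identical columns over all aligned columns (gap characters written as "-"), whereas the percent identity figures in this specification are determined by the alignment programs and parameters given in "Definitions."

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.
    Columns that are gaps in both sequences are skipped; a gap opposite a
    residue counts as a mismatch (a simplifying assumption)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ATGC", "ATGA"))       # 75.0
print(meets_threshold("ATGC", "ATGA", 85.0))  # False
```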

It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic 
code, a multitude of polynucleotide sequences encoding TRICH, some bearing minimal similarity to the 
polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the
invention contemplates each and every possible variation of polynucleotide sequence that could be made
by selecting combinations based on possible codon choices. These combinations are made in 
accordance with the standard triplet genetic code as applied to the polynucleotide sequence of naturally 
occurring TRICH, and all such variations are to be considered as being specifically disclosed. 
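
The scale of the degeneracy described above can be made concrete by counting the coding sequences that the standard genetic code allows for a given peptide. The sketch below builds the standard codon table and multiplies the number of synonymous codons at each position; the short peptides are arbitrary illustrations, not TRICH sequences.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences encoding the peptide under the
    standard code. Use '*' for a stop codon."""
    residues = peptide.upper()
    unknown = set(residues) - set(CODONS_PER_RESIDUE)
    if unknown:
        raise ValueError(f"unrecognized residues: {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in residues)


print(count_encodings("MW"))    # 1
print(count_encodings("MLS*"))  # 108
```

Because the count is a product over residues, it grows roughly exponentially with peptide length, which is why enumeration is impractical for a full length polypeptide.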

Although nucleotide sequences which encode TRICH and its variants are generally capable of
hybridizing to the nucleotide sequence of the naturally occurring TRICH under appropriately selected
conditions of stringency, it may be advantageous to produce nucleotide sequences encoding TRICH or 
its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally 
occurring codons. Codons may be selected to increase the rate at which expression of the peptide 
occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which
particular codons are utilized by the host. Other reasons for substantially altering the nucleotide
sequence encoding TRICH and its derivatives without altering the encoded amino acid sequences 
include the production of RNA transcripts having more desirable properties, such as a greater half-life, 
than transcripts produced from the naturally occurring sequence. 
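
Selecting codons according to host usage, without altering the encoded amino acid sequence, might be sketched as follows. The usage values below are placeholders invented for illustration and are not measured frequencies for any host; a real application would substitute a published codon usage table for the chosen organism.

```python
from itertools import product
from typing import Dict

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode_for_host(peptide: str, usage: Dict[str, float]) -> str:
    """Back-translate a peptide, choosing for each residue the synonymous
    codon with the highest value in `usage` (codons absent from `usage`
    score 0.0; ties resolve to the alphabetically first codon)."""
    by_residue: Dict[str, list] = {}
    for codon, aa in CODON_TABLE.items():
        by_residue.setdefault(aa, []).append(codon)
    return "".join(
        max(sorted(by_residue[aa]), key=lambda c: usage.get(c, 0.0))
        for aa in peptide.upper()
    )


# Placeholder usage values, NOT real host data.
hypothetical_usage = {"CTG": 0.5, "TTA": 0.1, "AAA": 0.7, "AAG": 0.3}
recoded = recode_for_host("MLK", hypothetical_usage)
print(recoded)             # ATGCTGAAA
print(translate(recoded))  # MLK
```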

The invention also encompasses production of DNA sequences which encode TRICH and
TRICH derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the
synthetic sequence may be inserted into any of the many available expression vectors and cell systems
using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce 
mutations into a sequence encoding TRICH or any fragment thereof. 

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID
NO:28-54 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol.
152:507-511.) Hybridization conditions, including annealing and wash conditions, are described in
"Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of the 
embodiments of the invention. The methods may employ such enzymes as the Klenow fragment of 
DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (PE 
Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway NJ), or
combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE
amplification system (Life Technologies, Gaithersburg MD). Preferably, sequence preparation is 
automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV), 
PTC200 thermal cycler (MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler (PE 
Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system
(PE Biosystems), the MEGABACE 1000 DNA sequencing system (Molecular Dynamics, Sunnyvale
CA), or other systems known in the art. The resulting sequences are analyzed using a variety of
algorithms which are well known in the art. (See, e.g., Ausubel, F.M. (1997) Short Protocols in
Molecular Biology, John Wiley & Sons, New York NY, unit 7.7; Meyers, R.A. (1995) Molecular
Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853.)

The nucleic acid sequences encoding TRICH may be extended utilizing a partial nucleotide
sequence and employing various PCR-based methods known in the art to detect upstream sequences,
such as promoters and regulatory elements. For example, one method which may be employed, 
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic 
DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.)
Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown
sequence from a circularized template. The template is derived from restriction fragments comprising a
known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids 
Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments adjacent 
to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom, M. et al.
(1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme digestions and
ligations may be used to insert an engineered double-stranded sequence into a region of unknown 
sequence before performing PCR. Other methods which may be used to retrieve unknown sequences 
are known in the art. (See, e.g., Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060). 
Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo
Alto CA) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in
finding intron/exon junctions. For all PCR-based methods, primers may be designed using 
commercially available software, such as OLIGO 4.06 primer analysis software (National Biosciences, 
Plymouth MN) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a 
GC content of about 50% or more, and to anneal to the template at temperatures of about 68°C to
72°C. 
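
The primer design criteria stated above can be expressed as a simple screen. This is a sketch under stated assumptions: the function names are hypothetical, the "about" in each criterion is treated as a hard bound, and the predicted annealing temperature is supplied by the caller (for example, from primer analysis software) rather than computed here.

```python
from typing import List


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    if not primer:
        raise ValueError("primer must not be empty")
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def primer_problems(primer: str, annealing_temp_c: float) -> List[str]:
    """List which of the stated criteria the primer fails:
    length of 22 to 30 nucleotides, GC content of 50% or more,
    annealing temperature of 68 to 72 degrees C (caller-supplied)."""
    problems = []
    if not 22 <= len(primer) <= 30:
        problems.append("length")
    if gc_fraction(primer) < 0.50:
        problems.append("gc_content")
    if not 68.0 <= annealing_temp_c <= 72.0:
        problems.append("annealing_temperature")
    return problems


# Arbitrary illustrative sequences, not primers from this specification.
print(primer_problems("GC" * 12, 70.0))  # []
print(primer_problems("AT" * 10, 60.0))
# ['length', 'gc_content', 'annealing_temperature']
```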

When screening for full length cDNAs, it is preferable to use libraries that have been 
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include 
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T) library 
does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into 5'
non-transcribed regulatory regions. 

Capillary electrophoresis systems which are commercially available may be used to analyze the 
size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary 
sequencing may employ flowable polymers for electrophoretic separation, four different
nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate 
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, PE Biosystems), and the entire process 
from loading of samples to computer analysis and electronic data display may be computer controlled. 
Capillary electrophoresis is especially preferable for sequencing small DNA fragments which may be 
present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotide sequences or fragments thereof which 
encode TRICH may be cloned in recombinant DNA molecules that direct expression of TRICH, or 
fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of 
the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent 
amino acid sequence may be produced and used to express TRICH.

The nucleotide sequences of the present invention can be engineered using methods generally 
known in the art in order to alter TRICH-encoding sequences for a variety of purposes including, but 
not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA 
shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic 
oligonucleotides may be used to engineer the nucleotide sequences. For example,
oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create new restriction sites,
alter glycosylation patterns, change codon preference, produce splice variants, and so forth. 

The nucleotides of the present invention may be subjected to DNA shuffling techniques such 
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number
5,837,458; Chang, C.-C. et al. (1999) Nat Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat Biotechnol. 14:315-319) to alter or 
improve the biological properties of TRICH, such as its biological or enzymatic activity or its ability 
to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene 
variants is produced using PCR-mediated recombination of gene fragments. The library is then 
subjected to selection or screening procedures that identify those gene variants with the desired 
properties. These preferred variants may then be pooled and further subjected to recursive rounds of 
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial" 
breeding and rapid molecular evolution. For example, fragments of a single gene containing random 
point mutations may be recombined, screened, and then reshuffled until the desired properties are 
optimized. Alternatively, fragments of a given gene may be recombined with fragments of 
homologous genes in the same gene family, either from the same or different species, thereby 
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable 
manner. 

In another embodiment, sequences encoding TRICH may be synthesized, in whole or in part, 
using chemical methods well known in the art. (See, e.g., Caruthers, M.H. et al. (1980) Nucleic Acids 
Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.) Alternatively, 
TRICH itself or a fragment thereof may be synthesized using chemical methods. For example, peptide 
synthesis can be performed using various solution-phase or solid-phase techniques. (See, e.g.,
Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New York NY,
pp. 55-60; and Roberge, J.Y. et al. (1995) Science 269:202-204.) Automated synthesis may be
achieved using the ABI 431A peptide synthesizer (PE Biosystems). Additionally, the amino acid
sequence of TRICH, or any part thereof, may be altered during direct synthesis and/or combined with 
sequences from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide 
having a sequence of a naturally occurring polypeptide. 

The peptide may be substantially purified by preparative high performance liquid 
chromatography. (See, e.g., Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421.)
The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing.
(See, e.g., Creighton, supra, pp. 28-53.)

In order to express a biologically active TRICH, the nucleotide sequences encoding TRICH or 
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains 
the necessary elements for transcriptional and translational control of the inserted coding sequence in a 
suitable host. These elements include regulatory sequences, such as enhancers, constitutive and 
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotide sequences
encoding TRICH. Such elements may vary in their strength and specificity. Specific initiation signals
may also be used to achieve more efficient translation of sequences encoding TRICH. Such signals
include the ATG initiation codon and adjacent sequences, e.g. the Kozak sequence. In cases where 
sequences encoding TRICH and its initiation codon and upstream regulatory sequences are inserted into 
the appropriate expression vector, no additional transcriptional or translational control signals may be
needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous 
translational control signals including an in-frame ATG initiation codon should be provided by the 
vector. Exogenous translational elements and initiation codons may be of various origins, both natural 
and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers appropriate 
for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl. Cell Differ.
20:125-162.) 
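
The role of the sequence adjacent to the ATG initiation codon can be illustrated by scoring the two positions of the Kozak context that are commonly treated as most influential: a purine at position -3 and a G at position +4, where the A of ATG is +1. The three-level classification below is a widely used rule of thumb and is offered as an assumption; it is not a method described in this specification, and the example sequences are arbitrary.

```python
def kozak_context(sequence: str, atg_index: int) -> str:
    """Classify the context of the ATG starting at `atg_index` (0-based)
    as 'strong', 'adequate', or 'weak' from positions -3 and +4."""
    seq = sequence.upper()
    if atg_index < 0 or seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    score = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[score]


print(kozak_context("GCCACCATGG", 6))  # strong
print(kozak_context("GCCTCCATGT", 6))  # weak
```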

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing sequences encoding TRICH and appropriate transcriptional and translational control 
elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in
vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, Plainview NY, ch. 4, 8, and 16-17; Ausubel, F.M. et al. (1995)
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, ch. 9, 13, and 16.)

A variety of expression vector/host systems may be utilized to contain and express sequences 
encoding TRICH. These include, but are not limited to, microorganisms such as bacteria transformed
with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); 
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or 
tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or 
animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S.M. Schuster
(1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO
J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and
Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di 
Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. 
USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al. 
(1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.)
The invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending 
upon the use intended for polynucleotide sequences encoding TRICH. For example, routine cloning, 
subcloning, and propagation of polynucleotide sequences encoding TRICH can be achieved using a 
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1 plasmid
(Life Technologies). Ligation of sequences encoding TRICH into the vector's multiple cloning site 
disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of transformed 
bacteria containing recombinant molecules. In addition, these vectors may be useful for in vitro 
transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested
deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem.
264:5503-5509.) When large quantities of TRICH are needed, e.g. for the production of antibodies, 
vectors which direct high level expression of TRICH may be used. For example, vectors containing the 
strong, inducible SP6 or T7 bacteriophage promoter may be used. 

Yeast expression systems may be used for production of TRICH. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable integration
of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel, 1995, supra;
Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184.)

Plant systems may also be used for expression of TRICH. Transcription of sequences encoding 
TRICH may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used alone or in 
combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311).
Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be
used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science
224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These constructs can
be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection. (See, 
e.g., The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp. 
191-196.) 

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, sequences encoding TRICH may be ligated into an
adenovirus transcription/translation complex consisting of the late promoter and tripartite leader 
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses TRICH in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc.
Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma
virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or
EBV-based vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of 
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers,
or vesicles) for therapeutic purposes. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) 

For long term production of recombinant proteins in mammalian systems, stable expression of 
TRICH in cell lines is preferred. For example, sequences encoding TRICH can be transformed into cell
lines using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector. Following the
introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before 
being switched to selective media. The purpose of the selectable marker is to confer resistance to a 
selective agent, and its presence allows growth and recovery of cells which successfully express the
introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue
culture techniques appropriate to the cell type. 

Any number of selection systems may be used to recover transformed cell lines. These include, 
but are not limited to, the herpes simplex virus thymidine kinase and adenine phosphoribosyltransferase 
genes, for use in tk⁻ and apr⁻ cells, respectively. (See, e.g., Wigler, M. et al. (1977) Cell 11:223-232;
Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic, or herbicide resistance can be
used as the basis for selection. For example, dhfr confers resistance to methotrexate; neo confers 
resistance to the aminoglycosides neomycin and G-418; and als and pat confer resistance to 
chlorsulfuron and phosphinotricin acetyltransferase, respectively. (See, e.g., Wigler, M. et al. (1980)
Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14.)
Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular requirements
for metabolites. (See, e.g., Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 
85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech),
β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system. (See, e.g., Rhodes, C.A.
(1995) Methods Mol. Biol. 55:121-131.) 

Although the presence/absence of marker gene expression suggests that the gene of interest is 
also present, the presence and expression of the gene may need to be confirmed. For example, if the 
sequence encoding TRICH is inserted within a marker gene sequence, transformed cells containing
sequences encoding TRICH can be identified by the absence of marker gene function. Alternatively, a
marker gene can be placed in tandem with a sequence encoding TRICH under the control of a single
promoter. Expression of the marker gene in response to induction or selection usually indicates 
expression of the tandem gene as well.

In general, host cells that contain the nucleic acid sequence encoding TRICH and that express
TRICH may be identified by a variety of procedures known to those of skill in the art. These
procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR 
amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or 
chip based technologies for the detection and/or quantification of nucleic acid or protein sequences. 

Immunological methods for detecting and measuring the expression of TRICH using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include
enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence 
activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal 
antibodies reactive to two non-interfering epitopes on TRICH is preferred, but a competitive binding
assay may be employed. These and other assays are well known in the art. (See, e.g., Hampton, R. et
al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN, Sect. IV; Coligan, J.E.
et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience, New
York NY; and Pound, J.D. (1998) Immunochemical Protocols, Humana Press, Totowa NJ.)

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization
or PCR probes for detecting sequences related to polynucleotides encoding TRICH include 
oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. 
Alternatively, the sequences encoding TRICH, or any fragments thereof, may be cloned into a vector for 
the production of an mRNA probe. Such vectors are known in the art, are commercially available, and
may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as
T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of 
commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega 
(Madison WI), and US Biochemical. Suitable reporter molecules or labels which may be used for ease 
of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as
well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with nucleotide sequences encoding TRICH may be cultured under 
conditions suitable for the expression and recovery of the protein from cell culture. The protein 
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence 
and/or the vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides which encode TRICH may be designed to contain signal sequences which direct
secretion of TRICH through a prokaryotic or eukaryotic cell membrane. 

In addition, a host cell strain may be chosen for its ability to modulate expression of the 
inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the 
polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation,
lipidation, and acylation. Post-translational processing which cleaves a "prepro" or "pro" form of the 
protein may also be used to specify protein targeting, folding, and/or activity. Different host cells 
which have specific cellular machinery and characteristic mechanisms for post-translational activities 
(e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture
Collection (ATCC, Manassas VA) and may be chosen to ensure the correct modification and processing
of the foreign protein. 

In another embodiment of the invention, natural, modified, or recombinant nucleic acid 
sequences encoding TRICH may be ligated to a heterologous sequence resulting in translation of a 
fusion protein in any of the aforementioned host systems. For example, a chimeric TRICH protein
containing a heterologous moiety that can be recognized by a commercially available antibody may
facilitate the screening of peptide libraries for inhibitors of TRICH activity. Heterologous protein and 
peptide moieties may also facilitate purification of fusion proteins using commercially available affinity 
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose 
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and 
hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion 
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins, 
respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion 
proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize 
these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site 
located between the TRICH encoding sequence and the heterologous protein sequence, so that TRICH 
may be cleaved away from the heterologous moiety following purification. Methods for fusion protein 
expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of commercially 
available kits may also be used to facilitate expression and purification of fusion proteins. 

In a further embodiment of the invention, synthesis of radiolabeled TRICH may be achieved in 
vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems 
couple transcription and translation of protein-coding sequences operably associated with the T7, T3, or 
SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for 
example, 35S-methionine. 

TRICH of the present invention or fragments thereof may be used to screen for compounds 
that specifically bind to TRICH. At least one and up to a plurality of test compounds may be 
screened for specific binding to TRICH. Examples of test compounds include antibodies, 
oligonucleotides, proteins (e.g., receptors), or small molecules. 

In one embodiment, the compound thus identified is closely related to the natural ligand of 
TRICH, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a 
natural binding partner. (See, e.g., Coligan, J.E. et al. (1991) Current Protocols in Immunology 1(2): 
Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which TRICH 
binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the 
compound can be rationally designed using known techniques. In one embodiment, screening for 
these compounds involves producing appropriate cells which express TRICH, either as a secreted 
protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. 
coli. Cells expressing TRICH or cell membrane fractions which contain TRICH are then contacted 
with a test compound and binding, stimulation, or inhibition of activity of either TRICH or the 
compound is analyzed. 

An assay may simply test binding of a test compound to the polypeptide, wherein binding is 
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, 
the assay may comprise the steps of combining at least one test compound with TRICH, either in 
solution or affixed to a solid support, and detecting the binding of TRICH to the compound. 
Alternatively, the assay may detect or measure binding of a test compound in the presence of a 
labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical 
libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a 
solid support. 

TRICH of the present invention or fragments thereof may be used to screen for compounds 
that modulate the activity of TRICH. Such compounds may include agonists, antagonists, or partial 
or inverse agonists. In one embodiment, an assay is performed under conditions permissive for TRICH 
activity, wherein TRICH is combined with at least one test compound, and the activity of TRICH in the 
presence of a test compound is compared with the activity of TRICH in the absence of the test 
compound. A change in the activity of TRICH in the presence of the test compound is indicative of a 
compound that modulates the activity of TRICH. Alternatively, a test compound is combined with an 
in vitro or cell-free system comprising TRICH under conditions suitable for TRICH activity, and the 
assay is performed. In either of these assays, a test compound which modulates the activity of TRICH 
may do so indirectly and need not come in direct contact with TRICH. At least one and up to 
a plurality of test compounds may be screened. 
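
The comparison described above, activity of TRICH with a test compound versus activity without it, can be sketched in a few lines. This is only an illustrative sketch: the function name, the activity units, and the 20% cutoff used to call a change are assumptions for the example and are not specified in this document.

```python
def classify_modulation(activity_with, activity_without, threshold=0.2):
    """Compare activity measured with and without a test compound.

    Returns a (label, fractional_change) pair. The 20% default threshold
    is an arbitrary illustrative cutoff, not a value taken from the text.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    # Fractional change relative to the no-compound control.
    change = (activity_with - activity_without) / activity_without
    if change > threshold:
        return "increases activity", change
    if change < -threshold:
        return "decreases activity", change
    return "no clear effect", change


# Hypothetical readings in arbitrary units.
label, change = classify_modulation(activity_with=40.0, activity_without=100.0)
print(label, round(change, 2))
```

In practice the cutoff would be set from replicate controls rather than fixed in advance.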

In another embodiment, polynucleotides encoding TRICH or their mammalian homologs may 
be "knocked out" in an animal model system using homologous recombination in embryonic stem (ES) 
cells. Such techniques are well known in the art and are useful for the generation of animal models of 
human disease. (See, e.g., U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For 
example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo 
and grown in culture. The ES cells are transformed with a vector containing the gene of interest 
disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) 
Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by 
homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP 
system to knock out a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D. 
(1996) J. Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330). 
Transformed ES cells are identified and microinjected into mouse blastocysts such as those from 
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the 
resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. 
Transgenic animals thus generated may be tested with potential therapeutic or toxic agents. 

Polynucleotides encoding TRICH may also be manipulated in vitro in ES cells derived from 
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell 
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate 
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al. (1998) 
Science 282:1145-1147). 

Polynucleotides encoding TRICH can also be used to create "knockin" humanized animals 
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region 
of a polynucleotide encoding TRICH is injected into animal ES cells, and the injected sequence 
integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae 
are implanted as described above. Transgenic progeny or inbred lines are studied and treated with 
potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a 
mammal inbred to overexpress TRICH, e.g., by secreting TRICH in its milk, may also serve as a 
convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74). 
THERAPEUTICS 

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists 
between regions of TRICH and transporters and ion channels. Therefore, TRICH appears to play a 
role in transport, neurological, muscle, and immunological disorders. In the treatment of disorders 
associated with increased TRICH expression or activity, it is desirable to decrease the expression or 
activity of TRICH. In the treatment of disorders associated with decreased TRICH expression or 
activity, it is desirable to increase the expression or activity of TRICH. 

Therefore, in one embodiment, TRICH or a fragment or derivative thereof may be 
administered to a subject to treat or prevent a disorder associated with decreased expression or 
activity of TRICH. Examples of such disorders include, but are not limited to, a transport disorder 
such as akinesia, amyotrophic lateral sclerosis, ataxia telangiectasia, cystic fibrosis, Becker's 
muscular dystrophy, Bell's palsy, Charcot-Marie-Tooth disease, diabetes mellitus, diabetes insipidus, 
diabetic neuropathy, Duchenne muscular dystrophy, hyperkalemic periodic paralysis, normokalemic 
periodic paralysis, Parkinson's disease, malignant hyperthermia, multidrug resistance, myasthenia 
gravis, myotonic dystrophy, catatonia, tardive dyskinesia, dystonias, peripheral neuropathy, cerebral 
neoplasms, prostate cancer, cardiac disorders associated with transport, e.g., angina, bradyarrhythmia, 
tachyarrhythmia, hypertension, Long QT syndrome, myocarditis, cardiomyopathy, nemaline 
myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, thyrotoxic myopathy, 
ethanol myopathy, dermatomyositis, inclusion body myositis, infectious myositis, polymyositis, 
neurological disorders associated with transport, e.g., Alzheimer's disease, amnesia, bipolar disorder, 
dementia, depression, epilepsy, Tourette's disorder, paranoid psychoses, and schizophrenia, and other 
disorders associated with transport, e.g., neurofibromatosis, postherpetic neuralgia, trigeminal 
neuropathy, sarcoidosis, sickle cell anemia, Wilson's disease, cataracts, infertility, pulmonary artery 
stenosis, sensorineural autosomal deafness, hyperglycemia, hypoglycemia, Graves' disease, goiter, 
Cushing's disease, Addison's disease, glucose-galactose malabsorption syndrome, 
hypercholesterolemia, adrenoleukodystrophy, Zellweger syndrome, Menkes disease, occipital horn 
syndrome, von Gierke disease, cystinuria, iminoglycinuria, Hartnup disease, and Fanconi disease; a 
neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, 
Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other 
extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive 
neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other 
demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural 
abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous 
system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and 
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the 
nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, 
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central 
nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic 
nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other 
neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, 
inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental 
disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), 
akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, 
postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, 
and familial frontotemporal dementia; a muscle disorder such as cardiomyopathy, myocarditis, 
Duchenne's muscular dystrophy, Becker's muscular dystrophy, myotonic dystrophy, central core 
disease, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, 
infectious myositis, polymyositis, dermatomyositis, inclusion body myositis, thyrotoxic myopathy, 
ethanol myopathy, angina, anaphylactic shock, arrhythmias, asthma, cardiovascular shock, Cushing's 
syndrome, hypertension, hypoglycemia, myocardial infarction, migraine, pheochromocytoma, and 
myopathies including encephalopathy, epilepsy, Kearns-Sayre syndrome, lactic acidosis, myoclonic 
disorder, ophthalmoplegia, and acid maltase deficiency (AMD, also known as Pompe's disease); and 
an immunological disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, 
adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, 
atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune 
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact 
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, 
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic 
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's 
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, 
myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, 
psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic 
anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative 
colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal 
circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma. 

In another embodiment, a vector capable of expressing TRICH or a fragment or derivative 
thereof may be administered to a subject to treat or prevent a disorder associated with decreased 
expression or activity of TRICH including, but not limited to, those described above. 

In a further embodiment, a composition comprising a substantially purified TRICH in 
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent a 
disorder associated with decreased expression or activity of TRICH including, but not limited to, those 
provided above. 

In still another embodiment, an agonist which modulates the activity of TRICH may be 
administered to a subject to treat or prevent a disorder associated with decreased expression or activity 
of TRICH including, but not limited to, those listed above. 

In a further embodiment, an antagonist of TRICH may be administered to a subject to treat or 
prevent a disorder associated with increased expression or activity of TRICH. Examples of such 
disorders include, but are not limited to, those transport, neurological, muscle, and immunological 
disorders described above. In one aspect, an antibody which specifically binds TRICH may be used 
directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a 
pharmaceutical agent to cells or tissues which express TRICH. 

In an additional embodiment, a vector expressing the complement of the polynucleotide 
encoding TRICH may be administered to a subject to treat or prevent a disorder associated with 
increased expression or activity of TRICH including, but not limited to, those described above. 

In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary 
sequences, or vectors of the invention may be administered in combination with other appropriate 
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made by 
one of ordinary skill in the art, according to conventional pharmaceutical principles. The combination 
of therapeutic agents may act synergistically to effect the treatment or prevention of the various 
disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with 
lower dosages of each agent, thus reducing the potential for adverse side effects. 

An antagonist of TRICH may be produced using methods which are generally known in the art. 
In particular, purified TRICH may be used to produce antibodies or to screen libraries of 
pharmaceutical agents to identify those which specifically bind TRICH. Antibodies to TRICH may 
also be generated using methods that are well known in the art. Such antibodies may include, but are 
not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and 
fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit 
dimer formation) are generally preferred for therapeutic use. 

For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans, 
and others may be immunized by injection with TRICH or with any fragment or oligopeptide thereof 
which has immunogenic properties. Depending on the host species, various adjuvants may be used to 
increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral gels 
such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic polyols, 
polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG 
(bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable. 

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to 
TRICH have an amino acid sequence consisting of at least about 5 amino acids, and generally will 
consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or 
fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches of 
TRICH amino acids may be fused with those of another protein, such as KLH, and antibodies to the 
chimeric molecule may be produced. 

Monoclonal antibodies to TRICH may be prepared using any technique which provides for the 
production of antibody molecules by continuous cell lines in culture. These include, but are not limited 
to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma 
technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. 
Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and 
Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120.) 

In addition, techniques developed for the production of "chimeric antibodies," such as the 
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate 
antigen specificity and biological activity, can be used. (See, e.g., Morrison, S.L. et al. (1984) Proc. 
Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; and Takeda, 
S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single 
chain antibodies may be adapted, using methods known in the art, to produce TRICH-specific single 
chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be 
generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g., Burton, 
D.R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.) 

Antibodies may also be produced by inducing in vivo production in the lymphocyte population 
or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in 
the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter, 
G. et al. (1991) Nature 349:293-299.) 

Antibody fragments which contain specific binding sites for TRICH may also be generated. 
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin 
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the 
F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy 
identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W.D. et al. 
(1989) Science 246:1275-1281.) 

Various immunoassays may be used for screening to identify antibodies having the desired 
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either 
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such 
immunoassays typically involve the measurement of complex formation between TRICH and its 
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive 
to two non-interfering TRICH epitopes is generally used, but a competitive binding assay may also be 
employed (Pound, supra). 

Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques 
may be used to assess the affinity of antibodies for TRICH. Affinity is expressed as an association 
constant, Ka, which is defined as the molar concentration of TRICH-antibody complex divided by the 
product of the molar concentrations of free antigen and free antibody under equilibrium conditions. The Ka determined 
for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for multiple 
TRICH epitopes, represents the average affinity, or avidity, of the antibodies for TRICH. The Ka 
determined for a preparation of monoclonal antibodies, which are monospecific for a particular TRICH 
epitope, represents a true measure of affinity. High-affinity antibody preparations with Ka ranging from 
about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the TRICH-antibody complex 
must withstand rigorous manipulations. Low-affinity antibody preparations with Ka ranging from 
about 10^6 to 10^7 L/mole are preferred for use in immunopurification and similar procedures which 
ultimately require dissociation of TRICH, preferably in active form, from the antibody (Catty, D. 
(1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC; Liddell, J.E. and A. 
Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York NY). 
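
As a rough illustration of the definition above, the association constant can be computed directly from equilibrium concentrations, or estimated from a Scatchard plot, where bound/free antigen is plotted against bound antigen and the slope of the fitted line equals -Ka. The sketch below assumes a single class of binding sites (the monoclonal case); for a polyclonal preparation the plot is typically curved and a straight-line fit gives only an average value. The function names and the synthetic data are hypothetical and are not taken from this document.

```python
def association_constant(complex_conc, free_antigen, free_antibody):
    """Ka in L/mol from equilibrium molar concentrations:
    Ka = [complex] / ([free antigen] * [free antibody])."""
    return complex_conc / (free_antigen * free_antibody)


def scatchard_fit(bound, free):
    """Least-squares line through (B, B/F) points.

    Scatchard relation for one class of sites: B/F = Ka * (Bmax - B),
    so slope = -Ka and intercept = Ka * Bmax.
    Returns (Ka, Bmax).
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    return ka, intercept / ka


# Synthetic, noise-free data generated for Ka = 1e9 L/mol, Bmax = 1e-9 mol/L.
true_ka, true_bmax = 1e9, 1e-9
bound = [2e-10, 4e-10, 6e-10, 8e-10]
free = [b / (true_ka * (true_bmax - b)) for b in bound]
ka, bmax = scatchard_fit(bound, free)
print(f"Ka = {ka:.3g} L/mol, Bmax = {bmax:.3g} mol/L")
```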

The titer and avidity of polyclonal antibody preparations may be further evaluated to determine 
the quality and suitability of such preparations for certain downstream applications. For example, a 
polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg 
specific antibody/ml, is generally employed in procedures requiring precipitation of TRICH-antibody 
complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for 
antibody quality and usage in various applications, are generally available. (See, e.g., Catty, supra, and 
Coligan et al. supra.) 

In another embodiment of the invention, the polynucleotides encoding TRICH, or any fragment 
or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene 
expression can be achieved by designing complementary sequences or antisense molecules (DNA, RNA, 
PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding TRICH. 
Such technology is well known in the art, and antisense oligonucleotides or larger fragments can be 
designed from various locations along the coding or control regions of sequences encoding TRICH. 
(See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ.) 
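
The basic operation behind designing a complementary (antisense) DNA sequence is taking the reverse complement of a chosen window of the sense strand. The following is a minimal sketch of that step only; the helper name and the example sequence are hypothetical (the sequence is not a TRICH sequence), and real oligonucleotide design would also weigh factors such as length, target accessibility, and chemical modification, none of which are modeled here.

```python
# Watson-Crick pairing for DNA bases, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense, start, length):
    """Return the antisense DNA oligonucleotide (written 5'->3') that is
    complementary to sense[start:start+length]."""
    target = sense[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("window falls outside the sense sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


# Hypothetical sense-strand fragment, for illustration only.
sense_fragment = "ATGGCCAAGT"
print(antisense_dna(sense_fragment, start=0, length=6))  # GGCCAT
```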

In therapeutic use, any gene delivery system suitable for introduction of the antisense 
sequences into appropriate target cells can be used. Antisense sequences can be delivered 
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence 
complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g., 
Slater, J.E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995) 
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral 
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood 
76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other 
gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other 
systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et 
al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res. 
25(14):2730-2736.) 

In another embodiment of the invention, polynucleotides encoding TRICH may be used for 
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency 
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked 
inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined 
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency 
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), 
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene 
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial 
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, 
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii) 
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated 
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g., 
against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988) 
Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA. 93:11395-11399), 
hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides 
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the 
case where a genetic deficiency in TRICH expression or regulation causes disease, the expression of 
TRICH from an appropriate population of transduced cells may alleviate the clinical manifestations 
caused by the genetic deficiency. 

In a further embodiment of the invention, diseases or disorders caused by deficiencies in 
TRICH are treated by constructing mammalian expression vectors encoding TRICH and introducing 
these vectors by mechanical means into TRICH-deficient cells. Mechanical transfer technologies for 
use with cells in vivo or ex vivo include (i) direct DNA microinjection into individual cells, (ii) ballistic 
gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) 
the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem. 
62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Récipon (1998) Curr. Opin. Biotechnol. 
9:445-450). 

Expression vectors that may be effective for the expression of TRICH include, but are not 
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA), 
PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF, 
PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). TRICH may be expressed 
using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous sarcoma virus 
(RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the 
tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci. U.S.A. 
89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and H.M. Blau (1998) 
Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen)); the 
ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the 
FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V. 
and Blau, H.M. supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous 
gene encoding TRICH from a normal individual. 

Commercially available liposome transformation kits (e.g., the PERFECT LIPID 
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver 
polynucleotides to target cells in culture and require minimal effort to optimize experimental 
parameters. In the alternative, transformation is performed using the calcium phosphate method 
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. 
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these 
standardized mammalian transfection protocols. 

In another embodiment of the invention, diseases or disorders caused by genetic defects with 
respect to TRICH expression are treated by constructing a retrovirus vector consisting of (i) the 
polynucleotide encoding TRICH under the control of an independent promoter or the retrovirus long 
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive 
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences 
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are 
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. 
Natl. Acad. Sci. U.S.A. 92:6733-6737), incorporated by reference herein. The vector is propagated in 
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for 
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. 
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and 
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. 
et al. (1998) J. Virol. 72:9873-9880). U.S. Patent Number 5,910,434 to Rigg ("Method for obtaining 
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses a 
method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference. 
Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the 
return of transduced cells to a patient are procedures well known to persons skilled in the art of gene 
therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et 
al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. 
(1998) Proc. Natl. Acad. Sci. U.S.A. 95:1201-1206; Su, L. (1997) Blood 89:2283-2290). 

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver 
polynucleotides encoding TRICH to cells which have one or more genetic abnormalities with respect to 
the expression of TRICH. The construction and packaging of adenovirus-based vectors are well known 
to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be 
versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas 
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu.
Rev. Nutr. 19:511-544 and Verma, I.M. and N. Somia (1997) Nature 389:239-242, both
incorporated by reference herein. 

In another alternative, a herpes-based gene therapy delivery system is used to deliver
polynucleotides encoding TRICH to target cells which have one or more genetic abnormalities with
respect to the expression of TRICH. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing TRICH to cells of the central nervous system, for which HSV has a 
tropism. The construction and packaging of herpes-based vectors are well known to those with 
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye
Res. 169:385-395). The construction of an HSV-1 virus vector has also been disclosed in detail in U.S.
Patent Number 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is 
hereby incorporated by reference. U.S. Patent Number 5,804,413 teaches the use of recombinant HSV 
d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under
the control of the appropriate promoter for purposes including human gene therapy. Also taught by this
patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22.
For HSV vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev.
Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned herpesvirus 
sequences, the generation of recombinant virus following the transfection of multiple plasmids
containing different segments of the large herpesvirus genomes, the growth and propagation of
herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary
skill in the art. 

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to 
deliver polynucleotides encoding TRICH to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on
the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting
in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g.,
protease and polymerase). Similarly, inserting the coding sequence for TRICH into the alphavirus 
genome in place of the capsid-coding region results in the production of a large number of TRICH- 
coding RNAs and the synthesis of high levels of TRICH in vector transduced cells. While alphavirus 
infection is typically associated with cell lysis within a few days, the ability to establish a persistent
infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that
the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application
(Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the
introduction of TRICH into a variety of cell types. The specific transduction of a subset of cells in a
population may require the sorting of cells prior to transduction. The methods of manipulating
infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and
performing alphavirus infections, are well known to those with ordinary skill in the art. 

Oligonucleotides derived from the transcription initiation site, e.g., between about positions -40 
and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can 
be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes
inhibition of the ability of the double helix to open sufficiently for the binding of polymerases,
transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have
been described in the literature. (See, e.g., Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr,
Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177.) A
complementary sequence or antisense molecule may also be designed to block translation of mRNA by
preventing the transcript from binding to ribosomes.

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of sequences encoding TRICH.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by 
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, 
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, 
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary 
oligonucleotides using ribonuclease protection assays. 
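
The initial scan for candidate cleavage sites described above may be illustrated by the following minimal sketch. It is offered for illustration only and is not a required implementation. The function name, the default window length of 20 ribonucleotides, and the centering of the window on the triplet are assumptions made here; the sketch does not perform the secondary structure or accessibility evaluation, which would follow as separate steps.

```python
def find_cleavage_sites(rna, triplets=("GUA", "GUU", "GUC"), window=20):
    """Return (position, triplet, window_sequence) for each candidate site.

    `rna` may be given as RNA or as the corresponding DNA; T is read as U.
    `window` is the length of the surrounding region to report (15 to 20
    ribonucleotides in the text); centering it on the triplet is an
    assumption of this sketch.
    """
    rna = rna.upper().replace("T", "U")
    flank = (window - 3) // 2
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in triplets:
            # Clip the window to the ends of the sequence.
            start = max(0, i - flank)
            end = min(len(rna), start + window)
            start = max(0, end - window)
            sites.append((i, triplet, rna[start:end]))
    return sites


if __name__ == "__main__":
    # Hypothetical target sequence, used only to exercise the function.
    target = "AAAAAAAAAAGUCAAAAAAAAAAAA"
    for position, triplet, region in find_cleavage_sites(target):
        print(position, triplet, region)
```

Each reported region may then be examined for secondary structural features and tested for accessibility as described.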

Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared by
any method known in the art for the synthesis of nucleic acid molecules. These include techniques for
chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis.
Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences
encoding TRICH. Such DNA sequences may be incorporated into a wide variety of vectors with
suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that
synthesize complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or
tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends
of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent in the production of PNAs and can be
extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and
wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine,
guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding TRICH.
Compounds which may be effective in altering expression of a specific polynucleotide may include,
but are not limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming
oligonucleotides, transcription factors and other polypeptide transcriptional regulators, and
non-macromolecular chemical entities which are capable of interacting with specific polynucleotide
sequences. Effective compounds may alter polynucleotide expression by acting as either inhibitors or
promoters of polynucleotide expression. Thus, in the treatment of disorders associated with increased
TRICH expression or activity, a compound which specifically inhibits expression of the
polynucleotide encoding TRICH may be therapeutically useful, and in the treatment of disorders
associated with decreased TRICH expression or activity, a compound which specifically promotes
expression of the polynucleotide encoding TRICH may be therapeutically useful.

At least one, and up to a plurality, of test compounds may be screened for effectiveness in 
altering expression of a specific polynucleotide. A test compound may be obtained by any method 
commonly known in the art, including chemical modification of a compound known to be effective in 
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding TRICH is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding TRICH are assayed 
by any method commonly known in the art. Typically, the expression of a specific polynucleotide is
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence
of the polynucleotide encoding TRICH. The amount of hybridization may be quantified, thus
forming the basis for a comparison of the expression of the polynucleotide both with and without
exposure to one or more test compounds. Detection of a change in the expression of a polynucleotide
exposed to a test compound indicates that the test compound is effective in altering the expression of
the polynucleotide. A screen for a compound effective in altering expression of a specific
polynucleotide can be carried out, for example, using a Schizosaccharomyces pombe gene expression
system (Atkins, D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids
Res. 28:E15) or a human cell line such as HeLa cells (Clarke, M.L. et al. (2000) Biochem. Biophys.
Res. Commun. 268:8-13). A particular embodiment of the present invention involves screening a 
combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide 
nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide
sequence (Bruice, T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S.
Patent No. 6,022,691). 

Many methods for introducing vectors into cells or tissues are available and equally suitable for 
use in vivo , in vitro , and ex vivo . For ex vivo therapy, vectors may be introduced into stem cells taken 
from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art. (See, e.g., Goldman, C.K. et al. (1997) Nat. 
Biotechnol. 15:462-466.) 

Any of the therapeutic methods described above may be applied to any subject in need of such 
therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and
monkeys.

An additional embodiment of the invention relates to the administration of a composition which 
generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient. 
Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various 
formulations are commonly known and are thoroughly discussed in the latest edition of Remington's
Pharmaceutical Sciences (Mack Publishing, Easton PA). Such compositions may consist of TRICH,
antibodies to TRICH, and mimetics, agonists, antagonists, or inhibitors of TRICH. 

The compositions utilized in this invention may be administered by any number of routes 
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal,
intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical,
sublingual, or rectal means. 

Compositions for pulmonary administration may be prepared in liquid or dry powder form. 
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the case 
of small molecules (e.g., traditional low molecular weight organic drugs), aerosol delivery of fast-acting
formulations is well-known in the art. In the case of macromolecules (e.g., larger peptides and proteins),
recent developments in the field of pulmonary delivery via the alveolar region of the lung have enabled
the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J.S. et al., U.S.
Patent No. 5,997,848). Pulmonary delivery has the advantage of administration without needle 
injection, and obviates the need for potentially toxic penetration enhancers. 

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination of
an effective dose is well within the capability of those skilled in the art. 

Specialized forms of compositions may be prepared for direct intracellular delivery of 
macromolecules comprising TRICH or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the
macromolecule. Alternatively, TRICH or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et 
al. (1999) Science 285:1569-1572). 

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys,
or pigs. An animal model may also be used to determine the appropriate concentration range and route 
of administration. Such information can then be used to determine useful doses and routes for 
administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example TRICH
or fragments thereof, antibodies of TRICH, and agonists, antagonists or inhibitors of TRICH, which
ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the 
patient, and the route of administration. 
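
By way of illustration only, the therapeutic index defined above as the LD50/ED50 ratio may be computed and used to rank candidate compositions as in the following sketch. The function name, the compound names, and the dose values are hypothetical and are not data from any study.

```python
def therapeutic_index(ld50, ed50):
    """Return the LD50/ED50 ratio; both doses must share the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
    candidates = {
        "compound_A": (500.0, 5.0),
        "compound_B": (200.0, 20.0),
    }
    # Compositions with larger therapeutic indices are listed first.
    ranked = sorted(
        candidates,
        key=lambda name: therapeutic_index(*candidates[name]),
        reverse=True,
    )
    for name in ranked:
        print(name, therapeutic_index(*candidates[name]))
```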

The exact dosage will be determined by the practitioner, in light of factors related to the subject 
requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the active
moiety or to maintain the desired effect. Factors which may be taken into account include the severity
of the disease state, the general health of the subject, the age, weight, and gender of the subject, time
and frequency of administration, drug combination(s), reaction sensitivities, and response to therapy.
Long-acting compositions may be administered every 3 to 4 days, every week, or biweekly depending 
on the half-life and clearance rate of the particular formulation. 

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.

DIAGNOSTICS

In another embodiment, antibodies which specifically bind TRICH may be used for the 
diagnosis of disorders characterized by expression of TRICH, or in assays to monitor patients being 
treated with TRICH or agonists, antagonists, or inhibitors of TRICH. Antibodies useful for diagnostic
purposes may be prepared in the same manner as described above for therapeutics. Diagnostic assays
for TRICH include methods which utilize the antibody and a label to detect TRICH in human body 
fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, and 
may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of 
reporter molecules, several of which are described above, are known in the art and may be used. 

A variety of protocols for measuring TRICH, including ELISAs, RIAs, and FACS, are known
in the art and provide a basis for diagnosing altered or abnormal levels of TRICH expression. Normal
or standard values for TRICH expression are established by combining body fluids or cell extracts
taken from normal mammalian subjects, for example, human subjects, with antibodies to TRICH under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of TRICH expressed in
subject, control, and disease samples from biopsied tissues are compared with the standard values. 
Deviation between standard and subject values establishes the parameters for diagnosing disease. 

In another embodiment of the invention, the polynucleotides encoding TRICH may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and
quantify gene expression in biopsied tissues in which expression of TRICH may be correlated with 
disease. The diagnostic assay may be used to determine absence, presence, and excess expression of 
TRICH, and to monitor regulation of TRICH levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide
sequences, including genomic sequences, encoding TRICH or closely related molecules may be used to
identify nucleic acid sequences which encode TRICH. The specificity of the probe, whether it is made
from a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a
conserved motif, and the stringency of the hybridization or amplification will determine whether the
probe identifies only naturally occurring sequences encoding TRICH, allelic variants, or related
sequences. 

Probes may also be used for the detection of related sequences, and may have at least 50% 
sequence identity to any of the TRICH encoding sequences. The hybridization probes of the subject 
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:28-54 or from
genomic sequences including promoters, enhancers, and introns of the TRICH gene.
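
As a simplified illustration of the 50% sequence identity threshold mentioned above, percent identity between a probe and a target region may be sketched as follows. This sketch assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and counts identical non-gap positions over the alignment length. It is an assumption made for illustration and does not replace any alignment algorithm or identity definition given elsewhere in this specification. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned and of equal length; '-' marks a gap and
    never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical probe and target fragments, for illustration only.
    probe = "ACGTACGT"
    target = "ACGTTTTT"
    identity = percent_identity(probe, target)
    print(identity, identity >= 50.0)
```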

Means for producing specific hybridization probes for DNAs encoding TRICH include the 
cloning of polynucleotide sequences encoding TRICH or TRICH derivatives into vectors for the 
production of mRNA probes. Such vectors are known in the art, are commercially available, and may 
be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety
of reporter groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels, such as
alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like. 

Polynucleotide sequences encoding TRICH may be used for the diagnosis of disorders 
associated with expression of TRICH. Examples of such disorders include, but are not limited to, a
transport disorder such as akinesia, amyotrophic lateral sclerosis, ataxia telangiectasia, cystic fibrosis,
Becker's muscular dystrophy, Bell's palsy, Charcot-Marie-Tooth disease, diabetes mellitus, diabetes
insipidus, diabetic neuropathy, Duchenne muscular dystrophy, hyperkalemic periodic paralysis,
normokalemic periodic paralysis, Parkinson's disease, malignant hyperthermia, multidrug resistance,
myasthenia gravis, myotonic dystrophy, catatonia, tardive dyskinesia, dystonias, peripheral
neuropathy, cerebral neoplasms, prostate cancer, cardiac disorders associated with transport, e.g.,
angina, bradyarrhythmia, tachyarrhythmia, hypertension, Long QT syndrome, myocarditis,
cardiomyopathy, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial 
myopathy, thyrotoxic myopathy, ethanol myopathy, dermatomyositis, inclusion body myositis,
infectious myositis, polymyositis, neurological disorders associated with transport, e.g., Alzheimer's
disease, amnesia, bipolar disorder, dementia, depression, epilepsy, Tourette's disorder, paranoid 
psychoses, and schizophrenia, and other disorders associated with transport, e.g., neurofibromatosis, 
postherpetic neuralgia, trigeminal neuropathy, sarcoidosis, sickle cell anemia, Wilson's disease, 
cataracts, infertility, pulmonary artery stenosis, sensorineural autosomal deafness, hyperglycemia,
hypoglycemia, Graves' disease, goiter, Cushing's disease, Addison's disease, glucose-galactose
malabsorption syndrome, hypercholesterolemia, adrenoleukodystrophy, Zellweger syndrome, Menkes
disease, occipital horn syndrome, von Gierke disease, cystinuria, iminoglycinuria, Hartnup disease, and
Fanconi disease; a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, 
cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's
disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron
disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple
sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural
empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral
central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases
of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, 
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central 
nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic 
nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other
neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis,
inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental
disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD),
akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses,
postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration,
and familial frontotemporal dementia; a muscle disorder such as cardiomyopathy, myocarditis,
Duchenne's muscular dystrophy, Becker's muscular dystrophy, myotonic dystrophy, central core 
disease, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, 
infectious myositis, polymyositis, dermatomyositis, inclusion body myositis, thyrotoxic myopathy, 
ethanol myopathy, angina, anaphylactic shock, arrhythmias, asthma, cardiovascular shock, Cushing's
syndrome, hypertension, hypoglycemia, myocardial infarction, migraine, pheochromocytoma, and
myopathies including encephalopathy, epilepsy, Kearns-Sayre syndrome, lactic acidosis, myoclonic
disorder, ophthalmoplegia, and acid maltase deficiency (AMD, also known as Pompe's disease); and 
an immunological disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, 
adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma,
atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, 
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic 
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, 
psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic 
anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative 
colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal
circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma. The
polynucleotide sequences encoding TRICH may be used in Southern or northern analysis, dot blot, or 
other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like 
assays; and in microarrays utilizing fluids or tissues from patients to detect altered TRICH expression. 
Such qualitative or quantitative methods are well known in the art. 

In a particular aspect, the nucleotide sequences encoding TRICH may be useful in assays that
detect the presence of associated disorders, particularly those mentioned above. The nucleotide
sequences encoding TRICH may be labeled by standard methods and added to a fluid or tissue sample
from a patient under conditions suitable for the formation of hybridization complexes. After a suitable
incubation period, the sample is washed and the signal is quantified and compared with a standard
value. If the amount of signal in the patient sample is significantly altered in comparison to a control
sample then the presence of altered levels of nucleotide sequences encoding TRICH in the sample 
indicates the presence of the associated disorder. Such assays may also be used to evaluate the efficacy 
of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the 
treatment of an individual patient. 

In order to provide a basis for the diagnosis of a disorder associated with expression of TRICH,
a normal or standard profile for expression is established. This may be accomplished by combining
body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a
fragment thereof, encoding TRICH, under conditions suitable for hybridization or amplification.
Standard hybridization may be quantified by comparing the values obtained from normal subjects with
values from an experiment in which a known amount of a substantially purified polynucleotide is used.
Standard values obtained in this manner may be compared with values obtained from samples from 
patients who are symptomatic for a disorder. Deviation from standard values is used to establish the 
presence of a disorder. 

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from 
successive assays may be used to show the efficacy of treatment over a period ranging from several 
days to months. 

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development
of the disease, or may provide a means for detecting the disease prior to the appearance of actual
clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ
preventative measures or aggressive treatment earlier thereby preventing the development or further
progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding TRICH 
may involve the use of PCR. These oligomers may be chemically synthesized, generated enzymatically, 
or produced in vitro . Oligomers will preferably contain a fragment of a polynucleotide encoding 
TRICH, or a fragment of a polynucleotide complementary to the polynucleotide encoding TRICH, and
will be employed under optimized conditions for identification of a specific gene or condition.
Oligomers may also be employed under less stringent conditions for detection or quantification of
closely related DNA or RNA sequences. 

In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences 
encoding TRICH may be used to detect single nucleotide polymorphisms (SNPs). SNPs are
substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease
in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation
polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers
derived from the polynucleotide sequences encoding TRICH are used to amplify DNA using the
polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal
tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary
and tertiary structures of PCR products in single-stranded form, and these differences are detectable
using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are
fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as
DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These computer-based
methods filter out sequence variations due to laboratory preparation of DNA and sequencing
errors using statistical models and automated analyses of DNA sequence chromatograms. In the
alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high
throughput MASSARRAY system (Sequenom, Inc., San Diego CA). 
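
As a rough illustration of the in silico comparison described above, the following Python sketch compares aligned fragments against a consensus sequence and reports positions where several high-quality reads agree on a variant base. The read format (offset, bases, quality scores), the thresholds, and the function name are assumptions made for this example only; actual isSNP methods rely on statistical models and chromatogram analysis that are not reproduced here.

```python
from collections import Counter

def candidate_snps(consensus, reads, min_quality=20, min_support=2):
    """Return {position: variant_base} for well-supported mismatches.

    reads: list of (offset, bases, qualities) tuples, each assumed to be
    aligned without gaps to the consensus starting at `offset`.
    This input format is hypothetical.
    """
    votes = {}
    for offset, bases, quals in reads:
        for i, (base, q) in enumerate(zip(bases, quals)):
            pos = offset + i
            if pos >= len(consensus):
                break
            # Skip low-quality calls (likely sequencing errors) and matches.
            if q < min_quality or base == consensus[pos]:
                continue
            votes.setdefault(pos, Counter())[base] += 1

    snps = {}
    for pos, counter in votes.items():
        base, count = counter.most_common(1)[0]
        # Require agreement among several independent fragments.
        if count >= min_support:
            snps[pos] = base
    return snps
```

A mismatch seen in only one fragment, or only in low-quality base calls, is discarded in this sketch, which mirrors in a very simplified way the filtering of preparation and sequencing errors.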

Methods which may also be used to quantify the expression of TRICH include radiolabeling or 
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from 
standard curves. (See, e.g., Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et
al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be
accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of
interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.
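
Interpolation from a standard curve can be sketched as follows. This is a minimal example that assumes piecewise-linear interpolation between bracketing standards; the function name and data layout are illustrative, and the cited references may use other curve-fitting approaches.

```python
def interpolate_from_standard_curve(standards, signal):
    """Estimate an amount by linear interpolation between bracketing standards.

    standards: list of (known_amount, measured_signal) pairs.
    signal: measured response of the unknown sample.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            # Linear interpolation between the two neighboring standards.
            return a0 + (signal - s0) * (a1 - a0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standard curve")
```

Signals outside the range of the standards raise an error rather than being extrapolated, since extrapolation beyond a standard curve is generally unreliable.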

In further embodiments, oligonucleotides or longer fragments derived from any of the 
polynucleotide sequences described herein may be used as elements on a microarray. The microarray
can be used in transcript imaging techniques which monitor the relative expression levels of large 
numbers of genes simultaneously as described below. The microarray may also be used to identify 
genetic variants, mutations, and polymorphisms. This information may be used to determine gene 
function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor 
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective 
treatment regimen for that patient. For example, therapeutic agents which are highly effective and 
display the fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.

In another embodiment, TRICH, fragments of TRICH, or antibodies specific for TRICH may
be used as elements on a microarray. The microarray may be used to monitor or measure protein-protein
interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to 
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of 
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at a 
given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 
5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by 
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present 
invention or their complements comprise a subset of a plurality of elements on a microarray. The 
resultant transcript image would provide a profile of gene activity.

Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies,
or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the
case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

Transcript images which profile the expression of the polynucleotides of the present invention 
may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity
(Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson (2000)
Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test compound has a
signature similar to that of a compound with known toxicity, it is likely to share those toxic properties.
These fingerprints or signatures are most useful and refined when they contain expression information
from a large number of genes and gene families. Ideally, a genome-wide measurement of expression
provides the highest quality signature. Even genes whose expression is not altered by any tested
compounds are important, as the levels of expression of these genes are used to normalize the
rest of the expression data. The normalization procedure is useful for comparison of expression data
after treatment with different compounds. While the assignment of gene function to elements of a
toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not
necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for
example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released
February 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is
important and desirable in toxicological screening using toxicant signatures to include all expressed
gene sequences.
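
The normalization and signature-matching steps described above can be illustrated with a short Python sketch. It assumes that expression levels are positive numbers, that a set of unaltered reference genes is already known, and that signatures are compared by Pearson correlation of log2 ratios. These choices, and all function names, are assumptions for illustration and are not taken from the cited references.

```python
import math

def normalize(expression, reference_genes):
    """Scale expression values by the mean level of unaltered reference genes."""
    ref_mean = sum(expression[g] for g in reference_genes) / len(reference_genes)
    return {gene: level / ref_mean for gene, level in expression.items()}

def signature(treated, untreated, reference_genes):
    """Log2 ratio of normalized treated vs. untreated levels for each gene."""
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {g: math.log2(t[g] / u[g]) for g in t if g in u}

def similarity(sig_a, sig_b):
    """Pearson correlation over the genes shared by two signatures.

    Assumes each signature varies across the shared genes (nonzero variance).
    """
    genes = sorted(set(sig_a) & set(sig_b))
    xs = [sig_a[g] for g in genes]
    ys = [sig_b[g] for g in genes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A test compound whose signature correlates strongly with that of a compound of known toxicity would, under this simplified scheme, be flagged for closer examination. No knowledge of gene function is needed for the comparison itself.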

In one embodiment, the toxicity of a test compound is assessed by treating a biological sample 
containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated 
biological sample are hybridized with one or more probes specific to the polynucleotides of the
present invention, so that transcript levels corresponding to the polynucleotides of the present 
invention may be quantified. The transcript levels in the treated biological sample are compared with 
levels in an untreated biological sample. Differences in the transcript levels between the two samples 
are indicative of a toxic response caused by the test compound in the treated sample. 






WO 01/46258 



PCT/US00/35095 



Another particular embodiment relates to the use of the polypeptide sequences of the present 
invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global 
pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome 
can be subjected individually to further analysis. Proteome expression patterns, or profiles, are 
analyzed by quantifying the number of expressed proteins and their relative abundance under given
conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and 
analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is 
achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by 
isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl
sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are
visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent 
such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is 
generally proportional to the level of the protein in the sample. The optical densities of equivalently 
positioned protein spots from different samples, for example, from biological samples either treated or
untreated with a test compound or therapeutic agent, are compared to identify any changes in protein
spot density related to the treatment. The proteins in the spots are partially sequenced using, for 
example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. 
The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of 
at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In
some cases, further sequence data may be obtained for definitive protein identification.
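
The comparison of a partial sequence to known polypeptide sequences can be sketched as an exact substring search. This is a deliberately simple illustration: the function name, the dictionary layout, and the placeholder sequences are hypothetical, and practical identification would typically tolerate mismatches and use mass data as well.

```python
def identify_protein(partial_sequence, known_polypeptides, min_length=5):
    """Return the names of known polypeptides that contain the partial sequence.

    known_polypeptides: dict mapping a name to an amino acid sequence in
    one-letter code. Names and sequences used with this function are
    placeholders, not actual sequences from the disclosure.
    """
    if len(partial_sequence) < min_length:
        raise ValueError(f"need at least {min_length} contiguous residues")
    return [name for name, seq in known_polypeptides.items()
            if partial_sequence in seq]
```

If the search returns more than one name, the partial sequence is not sufficient by itself, which corresponds to the case where further sequence data are needed for definitive identification.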

A proteomic profile may also be generated using antibodies specific for TRICH to quantify the 
levels of TRICH expression. In one embodiment, the antibodies are used as elements on a microarray, 
and protein expression levels are quantified by exposing the microarray to the sample and detecting the 
levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111;
Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a variety of
methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino- 
reactive fluorescent compound and detecting the amount of fluorescence bound at each array element. 

Toxicant signatures at the proteome level are also useful for toxicological screening, and should 
be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation
between transcript and protein abundances for some proteins in some tissues (Anderson, N.L. and J.
Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the 
analysis of compounds which do not significantly affect the transcript image, but which alter the 
proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid 
degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated biological 
sample are separated so that the amount of each protein can be quantified. The amount of each protein 
is compared to the amount of the corresponding protein in an untreated biological sample. A difference 
in the amount of protein between the two samples is indicative of a toxic response to the test compound
in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the 
individual proteins and comparing these partial sequences to the polypeptides of the present invention. 

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins from the biological sample are incubated
with antibodies specific to the polypeptides of the present invention. The amount of protein recognized
by the antibodies is quantified. The amount of protein in the treated biological sample is compared with 
the amount in an untreated biological sample. A difference in the amount of protein between the two 
samples is indicative of a toxic response to the test compound in the treated sample. 

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci.
USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA
94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.) Various types of microarrays are well
known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed. (1999)
Oxford University Press, London, hereby expressly incorporated by reference.

In another embodiment of the invention, nucleic acid sequences encoding TRICH may be used 
to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either 
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be 
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat.
Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet.
7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop genetic 
linkage maps, for example, which correlate the inheritance of a disease state with the inheritance of a 
particular chromosome region or restriction fragment length polymorphism (RFLP). (See, for 
example, Lander, E.S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic map
data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic map 
data can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) 
World Wide Web site. Correlation between the location of the gene encoding TRICH on a physical 
map and a specific disorder, or a predisposition to a specific disorder, may help define the region of
DNA associated with that disorder and thus may further positional cloning efforts. 

In situ hybridization of chromosomal preparations and physical mapping techniques, such as 
linkage analysis using established chromosomal markers, may be used for extending genetic maps. 
Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may
reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely localized
by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences
mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g.,
Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of the instant invention may
also be used to detect differences in the chromosomal location due to translocation, inversion, etc., 
among normal, carrier, or affected individuals. 

In another embodiment of the invention, TRICH, its catalytic or immunogenic fragments, or 
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes 
between TRICH and the agent being tested may be measured. 

Another technique for drug screening provides for high throughput screening of compounds 
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.) In this method, large numbers of different small test compounds are
synthesized on a solid substrate. The test compounds are reacted with TRICH, or fragments thereof,
and washed. Bound TRICH is then detected by methods well known in the art. Purified TRICH can 
also be coated directly onto plates for use in the aforementioned drug screening techniques. 
Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a
solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing 
antibodies capable of binding TRICH specifically compete with a test compound for binding TRICH. 
In this manner, antibodies can be used to detect the presence of any peptide which shares one or more 
antigenic determinants with TRICH.

In additional embodiments, the nucleotide sequences which encode TRICH may be used in any
molecular biology techniques that have yet to be developed, provided the new techniques rely on 
properties of nucleotide sequences that are currently known, including, but not limited to, such 
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure 
in any way whatsoever. 

The disclosures of all patents, applications and publications, mentioned above and below, in 
particular U.S. Ser. No. 60/172,000, U.S. Ser. No. 60/176,083, U.S. Ser. No. 60/177,332, U.S. Ser.
No. 60/178,572, U.S. Ser. No. 60/179,758, and U.S. Ser. No. 60/181,625, are expressly incorporated 
by reference herein. 

EXAMPLES 

I. Construction of cDNA Libraries

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database 
(Incyte Genomics, Palo Alto CA) and shown in Table 4, column 5. Some tissues were homogenized
and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a
suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol
and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted
with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and 
ethanol, or by other routine methods. 

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA 
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was 
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA 
purification kit (Ambion, Austin TX). 

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using the
recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, units
5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic
oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the
appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000
bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column
chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs
were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g.,
PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid
(Invitrogen, Carlsbad CA), or pINCY (Incyte Genomics, Palo Alto CA). Recombinant plasmids were
transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from
Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.

II. Isolation of cDNA Clones 

Plasmids obtained as described in Example I were recovered from host cells by in vivo excision 
using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least 
one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC 
Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid, QIAWELL 
8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96 plasmid
purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1 ml of 
distilled water and stored, with or without lyophilization, at 4°C. 

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a 
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal 
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384- 
well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using 
PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence scanner 
(Labsystems Oy, Helsinki, Finland). 

III. Sequencing and Analysis 

Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation 
such as the ABI CATALYST 800 (PE Biosystems) thermal cycler or the PTC-200 thermal cycler (MJ 
Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the MICROLAB 
2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared using reagents 
provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI 
PRISM BIGDYE Terminator cycle sequencing ready reaction kit (PE Biosystems). Electrophoretic 
separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out
using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or
377 sequencing system (PE Biosystems) in conjunction with standard ABI protocols and base calling
software; or other sequence analysis systems known in the art. Reading frames within the cDNA
sequences were identified using standard methods (reviewed in Ausubel, 1997, supra, unit 7.7). Some
of the cDNA sequences were selected for extension using the techniques disclosed in Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing vector, 
linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs based 
on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte cDNA
sequences or translations thereof were then queried against a selection of public databases such as the 
GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS, PRINTS, 
DOMO, PRODOM, and hidden Markov model (HMM)-based protein family databases such as PFAM. 
(HMM is a probabilistic approach which analyzes consensus primary structures of gene families.
See, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were
performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA
sequences were assembled to produce full length polynucleotide sequences. Alternatively, GenBank
cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or Genscan-predicted coding 
sequences (see Examples IV and V) were used to extend Incyte cDNA assemblages to full length.
Assembly was performed using programs based on Phred, Phrap, and Consed, and cDNA assemblages
were screened for open reading frames using programs based on GeneMark, BLAST, and FASTA. 
The full length polynucleotide sequences were translated to derive the corresponding full length 
polypeptide sequences which were subsequently analyzed by querying against databases such as the 
GenBank protein databases (genpept), SwissProt, BLOCKS, PRINTS, DOMO, PRODOM, Prosite,
and hidden Markov model (HMM)-based protein family databases such as PFAM. Full length
polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software
Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide and
polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL
algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which
also calculates the percent identity between aligned sequences.
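
For orientation, percent identity between two already-aligned sequences can be computed along the following lines. This sketch assumes one common convention (identical residues divided by the number of columns in which neither sequence has a gap); it is not a description of how MEGALIGN computes the value, and other conventions give different numbers.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns that contain no gap.

    Both inputs must be rows of the same alignment, so they have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```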

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of 
Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold 
parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second 
column provides brief descriptions thereof, the third column presents appropriate references, all of
which are incorporated by reference herein in their entirety, and the fourth column presents, where
applicable, the scores, probability values, and other parameters used to evaluate the strength of a match
between two sequences (the higher the score or the lower the probability value, the greater the identity 
between two sequences). 

The programs described above for the assembly and analysis of full length polynucleotide and
polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID
NO:28-54. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and 
amplification technologies are described in Table 4, column 4. 

IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative transporters and ion channels were initially identified by running the Genscan gene
identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is a
general-purpose gene identification program which analyzes genomic DNA sequences from a variety of 
organisms (See Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and S. Karlin 
(1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an
assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a
FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for
Genscan to analyze at once was set to 30 kb. To determine which of these Genscan predicted cDNA 
sequences encode transporters and ion channels, the encoded polypeptides were analyzed by querying 
against PFAM models for transporters and ion channels. Potential transporters and ion channels were
also identified by homology to Incyte cDNA sequences that had been annotated as transporters and ion
channels. These selected Genscan-predicted sequences were then compared by BLAST analysis to the 
genpept and gbpri public databases. Where necessary, the Genscan-predicted sequences were then 
edited by comparison to the top BLAST hit from genpept to correct errors in the sequence predicted by 
Genscan, such as extra or omitted exons. BLAST analysis was also used to find any Incyte cDNA or
public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for transcription.
When Incyte cDNA coverage was available, this information was used to correct or confirm the 
Genscan predicted sequence. Full length polynucleotide sequences were obtained by assembling 
Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences using 
the assembly process described in Example III. Alternatively, full length polynucleotide sequences were
derived entirely from edited or unedited Genscan-predicted coding sequences.
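
The concatenation of predicted exons into an assembled cDNA extending from a methionine codon to a stop codon can be illustrated with the sketch below. It does not reproduce Genscan itself: exon coordinates are simply taken as given, only the forward strand is handled, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def assemble_predicted_cdna(genomic, exons):
    """Join exon slices in genomic order.

    exons: iterable of (start, end) pairs, 0-based and end-exclusive,
    all on the forward strand (an assumption of this sketch).
    """
    return "".join(genomic[start:end] for start, end in sorted(exons))

def is_complete_orf(cdna):
    """True if cdna runs from ATG to a stop codon with no internal stop."""
    if len(cdna) % 3 != 0 or not cdna.startswith("ATG"):
        return False
    codons = [cdna[i:i + 3] for i in range(0, len(cdna), 3)]
    if codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[1:-1])
```

A check of this kind flags assembled sequences in which an extra or omitted exon has shifted the reading frame, which is one of the prediction errors the editing step described above is meant to correct.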

V. Assembly of Genomic Sequence Data with cDNA Sequence Data 
"Stitched" Sequences 

Partial cDNA sequences were extended with exons predicted by the Genscan gene identification 
program described in Example IV. Partial cDNAs assembled as described in Example III were mapped
to genomic DNA and parsed into clusters containing related cDNAs and Genscan exon predictions from
one or more genomic sequences. Each cluster was analyzed using an algorithm based on graph theory 
and dynamic programming to integrate cDNA and genomic information, generating possible splice 
variants that were subsequently confirmed, edited, or extended to create a full length sequence. 
Sequence intervals in which the entire length of the interval was present on more than one sequence in
the cluster were identified, and intervals thus identified were considered to be equivalent by transitivity.
For example, if an interval was present on a cDNA and two genomic sequences, then all three intervals
were considered to be equivalent. This process allows unrelated but consecutive genomic sequences to
be brought together, bridged by cDNA sequence. Intervals thus identified were then "stitched" together
by the stitching algorithm in the order that they appear along their parent sequences to generate the
longest possible sequence, as well as sequence variants. Linkages between intervals which proceed 
along one type of parent sequence (cDNA to cDNA or genomic sequence to genomic sequence) were 
given preference over linkages which change parent type (cDNA to genomic sequence). The resultant 
stitched sequences were translated and compared by BLAST analysis to the genpept and gbpri public
databases. Incorrect exons predicted by Genscan were corrected by comparison to the top BLAST hit
from genpept. Sequences were further extended with additional cDNA sequences, or by inspection of 
genomic DNA, when necessary. 
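
The notion of intervals being equivalent by transitivity can be illustrated with a union-find structure, as in the following Python sketch. This is only one plausible way to group equivalent intervals; the actual stitching algorithm (described above as based on graph theory and dynamic programming) is not disclosed in enough detail to reproduce, and the interval labels used here are placeholders.

```python
def equivalence_classes(pairs):
    """Group items into classes given pairwise equivalences (union-find).

    pairs: iterable of (a, b) tuples meaning "a is equivalent to b".
    Returns a sorted list of sorted lists, one per equivalence class.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    classes = {}
    for item in list(parent):
        classes.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in classes.values())
```

With an interval on a cDNA matched separately to each of two genomic sequences, all three end up in one class, so the two genomic sequences are linked even though they were never compared directly.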
"Stretched" Sequences 

Partial DNA sequences were extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III were queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases using
the BLAST program. The nearest GenBank protein homolog was then compared by BLAST analysis 
to either Incyte cDNA sequences or Genscan exon predicted sequences described in Example IV. A
chimeric protein was generated by using the resultant high-scoring segment pairs (HSPs) to map the
translated sequences onto the GenBank protein homolog. Insertions or deletions may occur in the
chimeric protein with respect to the original GenBank protein homolog. The GenBank protein homolog,
the chimeric protein, or both were used as probes to search for homologous genomic sequences from the 
public human genome databases. Partial DNA sequences were therefore "stretched" or extended by the 
addition of homologous genomic sequences. The resultant stretched sequences were examined to
determine whether they contained a complete gene.

VI. Chromosomal Mapping of TRICH Encoding Polynucleotides 

The sequences which were used to assemble SEQ ID NO:28-54 were compared with 
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other 
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched
SEQ ID NO:28-54 were assembled into clusters of contiguous and overlapping sequences using
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available 
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for 
Genome Research (WIGR), and Genethon were used to determine if any of the clustered sequences 
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment 
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of all sequences of that cluster, including its particular SEQ ID NO:, to that map location. 

Map locations are represented by ranges, or intervals, of human chromosomes. The map 
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's 
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between 
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in 
humans, although this can vary widely due to hot and cold spots of recombination.) The cM 
distances are based on genetic markers mapped by Généthon which provide boundaries for radiation 
hybrid markers whose sequences were included in each of the clusters. Human genome maps and 
other resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site 
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified 
disease genes map within or in proximity to the intervals indicated above. 
VII. Analysis of Polynucleotide Expression 

Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene 
and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a 
particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel (1995) 
supra, ch. 4 and 16.) 

Analogous computer techniques applying BLAST were used to search for identical or related 
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is 
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer 
search can be modified to determine whether any particular match is categorized as exact or similar. 
The basis of the search is the product score, which is defined as: 

\[
\frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}
\]

The product score takes into account both the degree of similarity between two sequences and the length 
of the sequence match. The product score is a normalized value between 0 and 100, and is calculated 
as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided 
by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by 
assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and -4 for 
every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more 
than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The 
product score represents a balance between fractional overlap and quality in a BLAST alignment. For 
example, a product score of 100 is produced only for 100% identity over the entire length of the shorter 
of the two sequences being compared. A product score of 70 is produced either by 100% identity and 
70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is 
produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap. 
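
The product score arithmetic described above can be sketched in a few lines. This is a minimal illustration only, assuming an ungapped HSP summarized by its counts of matching and mismatching bases; the function names are hypothetical and are not part of the BLAST or LIFESEQ software.

```python
def blast_score(matches, mismatches):
    """Score an ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, len_seq1, len_seq2):
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


# 100% identity over 70 bases of a 100-base sequence (70% overlap)
full_identity_partial_overlap = product_score(70, 0, 100, 100)
# 88% identity over the full 100 bases (100% overlap)
partial_identity_full_overlap = product_score(88, 12, 100, 100)
```

Under these assumptions the first case evaluates to 70 and the second to roughly 69, in line with the worked values given in the text.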
Alternatively, polynucleotide sequences encoding TRICH are analyzed with respect to the 
tissue sources from which they were derived. For example, some full length sequences are assembled, 
at least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is 
derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one 
of the following organ/tissue categories: cardiovascular system; connective tissue; digestive system; 
embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; 
hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory 
system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number of 
libraries in each category is counted and divided by the total number of libraries across all categories. 
Similarly, each human tissue is classified into one of the following disease/condition categories: cancer, 
cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, and other, and the 
number of libraries in each category is counted and divided by the total number of libraries across all 
categories. The resulting percentages reflect the tissue- and disease-specific expression of cDNA 
encoding TRICH. cDNA sequences and cDNA library/tissue information are found in the LIFESEQ 
GOLD database (Incyte Genomics, Palo Alto CA). 
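
The library-counting step above amounts to a simple tally. The following sketch assumes each contributing cDNA library has already been assigned a single category label; the function name and the example labels are illustrative and are not taken from the LIFESEQ GOLD database.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return the percentage of libraries that fall in each category.

    library_categories holds one category label per cDNA library.
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical example: four libraries contributing sequences to one polynucleotide
example = category_percentages(
    ["liver", "liver", "nervous system", "pancreas"]
)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.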
VIII. Extension of TRICH Encoding Polynucleotides 

Full length polynucleotide sequences were also produced by extension of an appropriate 
fragment of the full length molecule using oligonucleotide primers designed from this fragment. One 
primer was synthesized to initiate 5' extension of the known fragment, and the other primer was 
synthesized to initiate 3' extension of the known fragment. The initial primers were designed using 
OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30 
nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence 
at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin 
structures and primer-primer dimerizations was avoided. 
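
Two of the primer criteria above (length and GC content) are easy to check programmatically. The sketch below is not the OLIGO 4.06 algorithm; it is a hypothetical pre-filter that deliberately leaves out annealing temperature, hairpin, and primer-dimer checks, which need a thermodynamic model the text does not specify.

```python
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_basic_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check length (about 22 to 30 nt) and GC content (about 50% or more)."""
    if not primer:
        return False
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical candidates; only the first satisfies both criteria
candidates = ["GC" * 12, "AT" * 12, "GC" * 5]
passing = [p for p in candidates if meets_basic_criteria(p)]
```

The thresholds are exposed as parameters because the text gives them as approximate values ("about").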

Selected human cDNA libraries were used to extend the sequence. If more than one extension 
was necessary or desired, additional or nested sets of primers were designed. 

High fidelity amplification was obtained by PCR using methods well known in the art. PCR 
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction 
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, 
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme 
(Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer 
pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 
2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the 
alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C, 3 min; Step 2: 
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; 
Step 6: 68°C, 5 min; Step 7: storage at 4°C. 
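
The two cycling programs above differ only in annealing temperature, which the sketch below makes explicit by encoding them as data. It assumes that "repeated 20 times" means the three-step cycle runs once and is then repeated 20 more times (21 passes in total), and it ignores ramp times and the final 4°C hold; both the interpretation and the names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


def extension_program(anneal_c):
    """Build the extension PCR program for a given annealing temperature."""
    initial = Step(94, 180)                      # Step 1: 94°C, 3 min
    cycle = (Step(94, 15),                       # Step 2: 94°C, 15 sec
             Step(anneal_c, 60),                 # Step 3: annealing, 1 min
             Step(68, 120))                      # Step 4: 68°C, 2 min
    extra_repeats = 20                           # Step 5
    final = Step(68, 300)                        # Step 6: 68°C, 5 min
    return initial, cycle, extra_repeats, final


def total_seconds(initial, cycle, extra_repeats, final):
    """Programmed hold time, excluding ramps and the 4°C storage step."""
    passes = 1 + extra_repeats
    return initial.seconds + passes * sum(s.seconds for s in cycle) + final.seconds


pci_program = extension_program(anneal_c=60)     # primer pair PCI A / PCI B
t7_sk_program = extension_program(anneal_c=57)   # primer pair T7 / SK+
```

Under this reading, either program holds for 4575 seconds, a little over 76 minutes, before storage.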

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN 
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE 
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, 
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II 
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the 
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by electrophoresis 
on a 1% agarose gel to determine which reactions were successful in extending the sequence. 

The extended nucleotides were desalted and concentrated, transferred to 384-well plates, 
digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison WI), and 
sonicated or sheared prior to religation into pUC18 vector (Amersham Pharmacia Biotech). For 
shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose 
gels, fragments were excised, and agar digested with AGARACE (Promega). Extended clones were 
religated using T4 ligase (New England Biolabs, Beverly MA) into pUC18 vector (Amersham 
Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, 
and transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing 
media, and individual colonies were picked and cultured overnight at 37°C in 384-well plates in LB/2x 
carb liquid media. 

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase (Amersham 
Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 
94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4 
repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was quantified by PICOGREEN 
reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified 
using the same conditions as described above. Samples were diluted with 20% dimethylsulfoxide (1:2, 
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC 
DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle 
sequencing ready reaction kit (PE Biosystems). 

In like manner, full length polynucleotide sequences are verified using the above procedure or 
are used to obtain 5' regulatory sequences using the above procedure along with oligonucleotides 
designed for such extension, and an appropriate genomic library. 
IX. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from SEQ ID NO:28-54 are employed to screen cDNAs, genomic 
DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base pairs, is 
specifically described, essentially the same procedure is used with larger nucleotide fragments. 
Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 software (National 
Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of [γ-32P] adenosine 
triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase (DuPont NEN, Boston 
MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25 superfine size 
exclusion dextran bead column (Amersham Pharmacia Biotech). An aliquot containing 10^7 counts per 
minute of the labeled probe is used in a typical membrane-based hybridization analysis of human 
genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, Xba I, or 
Pvu II (DuPont NEN). 

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon 
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16 
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature 
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate. 
Hybridization patterns are visualized using autoradiography or an alternative imaging means and 
compared. 

X. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing 
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra), mechanical 
microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned 
technologies should be uniform and solid with a non-porous surface (Schena (1999), supra). Suggested 
substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a procedure 
analogous to a dot or slot blot may also be used to arrange and link elements to the surface of a 
substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may be 
produced using available methods and machines well known to those of ordinary skill in the art and may 
contain any appropriate number of elements. (See, e.g., Schena, M. et al. (1995) Science 270:467-470; 
Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat. Biotechnol. 
16:27-31.) 

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may 
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be 
selected using software well known in the art such as LASERGENE software (DNASTAR). The array 
elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the 
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection. 
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a 
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser 
desorption and mass spectrometry may be used for detection of hybridization. The degree of 
complementarity and the relative abundance of each polynucleotide which hybridizes to an element on 
the microarray may be assessed. In one embodiment, microarray preparation and usage is described in 
detail below. 

Tissue or Cell Sample Preparation 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and 
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is 
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X first 
strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM 
dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse 
transcription reaction is performed in a 25 ml volume containing 200 ng poly(A)+ RNA with 
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription 
from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one 
with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and 
incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified 
using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. 
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated 
using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The sample is 
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and 
resuspended in 14 µl 5X SSC/0.2% SDS. 
Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element is 
amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses 
primers complementary to the vector sequences flanking the cDNA insert. Array elements are 
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia 
Biotech). 

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope 
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water 
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR 
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, and 
coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C 
oven. 

Array elements are applied to the coated glass substrate using a procedure described in US 
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average 
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic 
apparatus. The apparatus then deposits about 5 nl of array element sample per slide. 

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). 
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. 
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate 
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in 
0.2% SDS and distilled water as before. 
Hybridization 

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and 
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample 
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with 
a 1.8 cm^2 coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly 
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 
140 µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for 
about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 
0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried. 
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an 
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines 
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is 
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide 
containing the array is placed on a computer-controlled X-Y stage on the microscope and 
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a 
resolution of 20 micrometers. 

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. 
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, 
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate 
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The 
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is 
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, 
although the apparatus is capable of recording the spectra from both fluorophores simultaneously. 

The sensitivity of the scans is typically calibrated using the signal intensity generated by a 
cDNA control species added to the sample mixture at a known concentration. A specific location on 
the array contains a complementary DNA sequence, allowing the intensity of the signal at that 
location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples 
from different sources (e.g., representing test and control cells), each labeled with a different 
fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially 
expressed, the calibration is done by labeling samples of the calibrating cDNA with the two 
fluorophores and adding identical amounts of each to the hybridization mixture. 
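
One way to use such a two-color calibration is to balance the channels before comparing test and control signals. The text does not spell out this arithmetic, so the following is only a sketch under the assumption that the calibrating cDNA, added in identical amounts for both fluorophores, should yield equal signal in each channel; the function and parameter names are hypothetical.

```python
def balanced_ratio(cy3_signal, cy5_signal, control_cy3, control_cy5):
    """Return the Cy5/Cy3 ratio for one array element after channel balancing.

    The Cy5 channel is rescaled so that the calibrating control spot,
    which received identical amounts of both labels, has a ratio of 1.
    """
    if cy3_signal <= 0 or control_cy5 <= 0:
        raise ValueError("signals used as divisors must be positive")
    scale = control_cy3 / control_cy5
    return (cy5_signal * scale) / cy3_signal


# Hypothetical intensities: the Cy5 channel reads half as bright as Cy3 overall
ratio = balanced_ratio(cy3_signal=200.0, cy5_signal=100.0,
                       control_cy3=1000.0, control_cy5=500.0)
```

In this hypothetical case the balanced ratio is 1.0, meaning the apparent two-fold difference is attributed to the channels rather than to differential expression.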

The output of the photomultiplier tube is digitized using a 12-bit RH-835H analog-to-digital 
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC 
computer. The digitized data are displayed as an image where the signal intensity is mapped using a 
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high 
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and 
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping 
emission spectra) between the fluorophores using each fluorophore's emission spectrum. 
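
The linear 20-color display mapping can be illustrated as follows. This sketch assumes the 20 color bins evenly divide the full 12-bit range (0 to 4095); the actual scaling used by the display software is not stated in the text, and the function name is illustrative.

```python
ADC_LEVELS = 2 ** 12      # a 12-bit converter yields values 0..4095
COLOR_BINS = 20           # 0 = blue (low signal) ... 19 = red (high signal)


def color_bin(adc_value):
    """Map a 12-bit intensity to one of 20 equally wide pseudocolor bins."""
    if not 0 <= adc_value < ADC_LEVELS:
        raise ValueError("value outside 12-bit range")
    return adc_value * COLOR_BINS // ADC_LEVELS


low, mid, high = color_bin(0), color_bin(2048), color_bin(4095)
```

Integer arithmetic keeps the bin edges exact, so the top reading of 4095 lands in the last bin rather than overflowing into a 21st.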

A grid is superimposed over the fluorescence signal image such that the signal from each spot 
is centered in each element of the grid. The fluorescence signal within each element is then integrated 
to obtain a numerical value corresponding to the average intensity of the signal. The software used 
for signal analysis is the GEMTOOLS gene expression analysis program (Incyte). 
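
The grid integration step can be sketched as below. This is not the GEMTOOLS implementation, whose internals are not described here; it assumes a regular grid of equally sized rectangular elements aligned with the image, and all names are hypothetical.

```python
def spot_averages(image, cell_h, cell_w):
    """Average the pixel intensities inside each grid element.

    image is a list of rows of pixel values; each grid element covers
    cell_h rows by cell_w columns. Partial elements at the edges are ignored.
    """
    if not image:
        return []
    n_rows = len(image) // cell_h
    n_cols = len(image[0]) // cell_w
    averages = []
    for r in range(n_rows):
        row_values = []
        for c in range(n_cols):
            total = 0
            for y in range(r * cell_h, (r + 1) * cell_h):
                for x in range(c * cell_w, (c + 1) * cell_w):
                    total += image[y][x]
            row_values.append(total / (cell_h * cell_w))
        averages.append(row_values)
    return averages


# Hypothetical 4 x 4 pixel image holding four 2 x 2 pixel spots
image = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
spots = spot_averages(image, cell_h=2, cell_w=2)
```

A real analysis would also need to locate and center each spot and subtract local background, which this sketch omits.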
XI. Complementary Polynucleotides 

Sequences complementary to the TRICH-encoding sequences, or any parts thereof, are used to 
detect, decrease, or inhibit expression of naturally occurring TRICH. Although use of oligonucleotides 
comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with 
smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO 
4.06 software (National Biosciences) and the coding sequence of TRICH. To inhibit transcription, a 
complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent 
promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is 
designed to prevent ribosomal binding to the TRICH-encoding transcript. 
XII. Expression of TRICH 
Expression and purification of TRICH is achieved using bacterial or virus-based expression 
systems. For expression of TRICH in bacteria, cDNA is subcloned into an appropriate vector 
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA 
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid 
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory 
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). 
Antibiotic resistant bacteria express TRICH upon induction with isopropyl 
beta-D-thiogalactopyranoside (IPTG). Expression of TRICH in eukaryotic cells is achieved by infecting insect 
or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus 
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is 
replaced with cDNA encoding TRICH by either homologous recombination or bacterial-mediated 
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong 
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to 
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. 
Infection of the latter requires additional genetic modifications to baculovirus. (See Engelhard, E.K. et 
al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 
7:1937-1945.) 

In most expression systems, TRICH is synthesized as a fusion protein with, e.g., glutathione 
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, 
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton 
enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized 
glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia 
Biotech). Following purification, the GST moiety can be proteolytically cleaved from TRICH at 
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification 
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins 
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, 
ch. 10 and 16). Purified TRICH obtained by these methods can be used directly in the assays shown in 
Examples XVI, XVII, and XVIII, where applicable. 

XIII. Functional Assays 

TRICH function is assessed by expressing the sequences encoding TRICH at physiologically 
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression 
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice 
include PCMV SPORT (Life Technologies) and PCR3.1 (Invitrogen, Carlsbad CA), both of which 
contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently transfected into a 
human cell line, for example, an endothelial or hematopoietic cell line, using either liposome 
formulations or electroporation. 1-2 µg of an additional plasmid containing sequences encoding a 
marker protein are co-transfected. Expression of a marker protein provides a means to distinguish 
transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the 
recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; 
Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser 
optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the 
apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of 
fluorescent molecules that diagnose events preceding or coincident with cell death. These events include 
changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in 
cell size and granularity as measured by forward light scatter and 90 degree side light scatter; 
down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in 
expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; 
and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated 
Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M.G. 
(1994) Flow Cytometry, Oxford, New York NY. 

The influence of TRICH on gene expression can be assessed using highly purified populations 
of cells transfected with sequences encoding TRICH and either CD64 or CD64-GFP. CD64 and 
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human 
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using 
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success NY). 
mRNA can be purified from the cells using methods well known by those of skill in the art. Expression 
of mRNA encoding TRICH and other genes of interest can be analyzed by northern analysis or 
microarray techniques. 

XIV. Production of TRICH Specific Antibodies 

TRICH substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., 
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to 
immunize rabbits and to produce antibodies using standard protocols. 

Alternatively, the TRICH amino acid sequence is analyzed using LASERGENE software 
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well 
described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.) 

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A 
peptide synthesizer (PE Biosystems) using FMOC chemistry and coupled to KLH (Sigma-Aldrich, St. 
Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase 
immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the oligopeptide-KLH 
complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-TRICH 
activity by, for example, binding the peptide or TRICH to a substrate, blocking with 1% BSA, reacting 
with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. 

XV. Purification of Naturally Occurring TRICH Using Specific Antibodies 
Naturally occurring or recombinant TRICH is substantially purified by immunoaffinity 
chromatography using antibodies specific for TRICH. An immunoaffinity column is constructed by 
covalently coupling anti-TRICH antibody to an activated chromatographic resin, such as 
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is 
blocked and washed according to the manufacturer's instructions. 

Media containing TRICH are passed over the immunoaffinity column, and the column is 
washed under conditions that allow the preferential absorbance of TRICH (e.g., high ionic strength 
buffers in the presence of detergent). The column is eluted under conditions that disrupt 
antibody/TRICH binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such 
as urea or thiocyanate ion), and TRICH is collected. 

XVI. Identification of Molecules Which Interact with TRICH 

Molecules which interact with TRICH may include transporter substrates, agonists or 
antagonists, modulatory proteins such as Gβγ proteins (Reimann, supra) or proteins involved in TRICH 
localization or clustering such as MAGUKs (Craven, supra). TRICH, or biologically active fragments 
thereof, are labeled with 125I Bolton-Hunter reagent. (See, e.g., Bolton A.E. and W.M. Hunter (1973) 
Biochem. J. 133:529-539.) Candidate molecules previously arrayed in the wells of a multi-well plate 
are incubated with the labeled TRICH, washed, and any wells with labeled TRICH complex are 
assayed. Data obtained using different concentrations of TRICH are used to calculate values for the 
number, affinity, and association of TRICH with the candidate molecules. 

Alternatively, proteins that interact with TRICH are isolated using the yeast 2-hybrid system 
(Fields, S. and O. Song (1989) Nature 340:245-246). TRICH, or fragments thereof, are expressed as 
fusion proteins with the DNA binding domain of Gal4 or lexA and potential interacting proteins are 
expressed as fusion proteins with an activation domain. Interactions between the TRICH fusion protein 
and the potential interacting proteins reconstitute a transactivation function that is observed by 
expression of a reporter gene. Yeast 
2-hybrid systems are commercially available, and methods for use of the yeast 2-hybrid system with ion 
channel proteins are discussed in Niethammer, M. and M. Sheng (1998, Meth. Enzymol. 293:104-122). 

TRICH may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) 
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions 
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Patent 
No. 6,057,101). 

Potential TRICH agonists or antagonists may be tested for activation or inhibition of TRICH 
ion channel activity using the assays described in section XVIII. 
XVII. Demonstration of TRICH Activity 

Ion channel activity of TRICH is demonstrated using an electrophysiological assay for ion 
conductance. TRICH can be expressed by transforming a mammalian cell line such as COS7, HeLa 
or CHO with a eukaryotic expression vector encoding TRICH. Eukaryotic expression vectors are 
commercially available, and the techniques to introduce them into cells are well known to those 
skilled in the art. A second plasmid which expresses any one of a number of marker genes, such as 
β-galactosidase, is co-transformed into the cells to allow rapid identification of those cells which have 
taken up and expressed the foreign DNA. The cells are incubated for 48-72 hours after transformation 
under conditions appropriate for the cell line to allow expression and accumulation of TRICH and 
β-galactosidase. 

Transformed cells expressing β-galactosidase are stained blue when a suitable colorimetric
substrate is added to the culture media under conditions that are well known in the art. Stained cells
are tested for differences in membrane conductance by electrophysiological techniques that are well
known in the art. Untransformed cells, and/or cells transformed with either vector sequences alone or
β-galactosidase sequences alone, are used as controls and tested in parallel. Cells expressing TRICH
will have higher anion or cation conductance relative to control cells. The contribution of TRICH to
conductance can be confirmed by incubating the cells with antibodies specific for TRICH. The
antibodies will bind to the extracellular side of TRICH, thereby blocking the pore in the ion channel
and the associated conductance.

Alternatively, ion channel activity of TRICH is measured as current flow across a
TRICH-containing Xenopus laevis oocyte membrane using the two-electrode voltage-clamp technique
(Ishi et al., supra; Jegla, T. and L. Salkoff (1997) J. Neurosci. 17:32-44). TRICH is subcloned into an
appropriate Xenopus oocyte expression vector, such as pBF, and 0.5-5 ng of mRNA is injected into
mature stage IV oocytes. Injected oocytes are incubated at 18°C for 1-5 days. Inside-out
macropatches are excised into an intracellular solution containing 116 mM K-gluconate, 4 mM KCl,
and 10 mM Hepes (pH 7.2). The intracellular solution is supplemented with varying concentrations
of the TRICH mediator, such as cAMP, cGMP, or Ca2+ (in the form of CaCl2), where appropriate.
Electrode resistance is set at 2-5 MΩ and electrodes are filled with the intracellular solution lacking
mediator. Experiments are performed at room temperature from a holding potential of 0 mV. Voltage
ramps (2.5 s) from -100 to 100 mV are acquired at a sampling frequency of 500 Hz. Current measured
is proportional to the activity of TRICH in the assay.
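
The ramp protocol above can be sketched numerically. The following is a minimal illustration, assuming a linear ramp and an approximately ohmic current-voltage relation over the ramp range. The function names `ramp_voltages` and `slope_conductance_ns` are hypothetical and are not part of the original protocol.

```python
def ramp_voltages(v_start_mv=-100.0, v_end_mv=100.0, duration_s=2.5, rate_hz=500.0):
    """Return the command voltage (mV) at each sample of a linear ramp."""
    n_samples = int(round(duration_s * rate_hz))  # 2.5 s x 500 Hz = 1250 samples
    step = (v_end_mv - v_start_mv) / (n_samples - 1)
    return [v_start_mv + i * step for i in range(n_samples)]


def slope_conductance_ns(voltages_mv, currents_pa):
    """Least-squares slope of current against voltage.

    With current in pA and voltage in mV the slope is in nS
    (1 pA / 1 mV = 1e-12 A / 1e-3 V = 1e-9 S).
    """
    n = len(voltages_mv)
    mean_v = sum(voltages_mv) / n
    mean_i = sum(currents_pa) / n
    svv = sum((v - mean_v) ** 2 for v in voltages_mv)
    svi = sum((v - mean_v) * (i - mean_i)
              for v, i in zip(voltages_mv, currents_pa))
    return svi / svv


if __name__ == "__main__":
    volts = ramp_voltages()
    # Synthetic ohmic current: 2 nS conductance plus a 5 pA offset (made-up values)
    amps = [2.0 * v + 5.0 for v in volts]
    print(len(volts), slope_conductance_ns(volts, amps))
```

Comparing the slope obtained from TRICH-injected oocytes with that from control oocytes gives one simple measure of the conductance attributable to TRICH; a rectifying channel would need a fit over a narrower voltage window.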

Transport activity of TRICH is assayed by measuring uptake of labeled substrates into
Xenopus laevis oocytes. Oocytes at stages V and VI are injected with TRICH mRNA (10 ng per
oocyte) and incubated for 3 days at 18°C in OR2 medium (82.5 mM NaCl, 2.5 mM KCl, 1 mM CaCl2,
1 mM MgCl2, 1 mM Na2HPO4, 5 mM Hepes, 3.8 mM NaOH, 50 μg/ml gentamycin, pH 7.8) to allow
expression of TRICH. Oocytes are then transferred to standard uptake medium (100 mM NaCl, 2 mM
KCl, 1 mM CaCl2, 1 mM MgCl2, 10 mM Hepes/Tris pH 7.5). Uptake of various substrates (e.g.,
amino acids, sugars, drugs, ions, and neurotransmitters) is initiated by adding labeled substrate (e.g.,
radiolabeled with 3H, fluorescently labeled with rhodamine, etc.) to the oocytes. After incubating for 30
minutes, uptake is terminated by washing the oocytes three times in Na+-free medium, measuring the
incorporated label, and comparing with controls. TRICH activity is proportional to the level of
internalized labeled substrate.
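
The comparison with controls can be sketched as a simple calculation. This is an illustration only, assuming that the incorporated label has been measured per oocyte (for example as scintillation counts) for both TRICH-injected and control oocytes. The function name `specific_uptake` is hypothetical.

```python
from statistics import mean


def specific_uptake(injected_counts, control_counts):
    """Return (net uptake, fold over control) from per-oocyte label counts.

    Net uptake is the mean signal of TRICH-injected oocytes minus the
    mean signal of control oocytes processed in parallel.
    """
    injected_mean = mean(injected_counts)
    control_mean = mean(control_counts)
    net = injected_mean - control_mean
    fold = injected_mean / control_mean if control_mean > 0 else float("inf")
    return net, fold


if __name__ == "__main__":
    # Made-up counts per oocyte, for illustration only
    print(specific_uptake([900, 1100, 1000], [90, 110, 100]))
```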

ATPase activity associated with TRICH can be measured by hydrolysis of radiolabeled
ATP-[γ-32P], separation of the hydrolysis products by chromatographic methods, and quantitation of the
recovered 32P using a scintillation counter. The reaction mixture contains ATP-[γ-32P] and varying
amounts of TRICH in a suitable buffer incubated at 37°C for a suitable period of time. The reaction is
terminated by acid precipitation with trichloroacetic acid and then neutralized with base, and an aliquot
of the reaction mixture is subjected to membrane or filter paper-based chromatography to separate the
reaction products. The amount of 32P liberated is counted in a scintillation counter. The amount of
radioactivity recovered is proportional to the ATPase activity of TRICH in the assay.
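
Converting the recovered radioactivity into a specific activity can be sketched as follows. This is a minimal illustration, assuming that a no-enzyme blank and the total counts added to the reaction are measured alongside the released 32P. The function name `atpase_specific_activity` and all parameter names are hypothetical.

```python
def atpase_specific_activity(cpm_released, cpm_blank, cpm_total,
                             atp_nmol, minutes, protein_mg):
    """Return nmol of phosphate released per minute per mg of protein.

    cpm_released: counts in the liberated-phosphate fraction
    cpm_blank:    counts in the same fraction of a no-enzyme control
    cpm_total:    total counts of labeled ATP added to the reaction
    atp_nmol:     total ATP (labeled plus unlabeled) in the reaction
    """
    if cpm_total <= 0 or minutes <= 0 or protein_mg <= 0:
        raise ValueError("cpm_total, minutes and protein_mg must be positive")
    fraction_hydrolyzed = max(cpm_released - cpm_blank, 0.0) / cpm_total
    pi_nmol = fraction_hydrolyzed * atp_nmol
    return pi_nmol / (minutes * protein_mg)


if __name__ == "__main__":
    # Made-up values, for illustration only
    print(atpase_specific_activity(5200, 200, 50000, 100.0, 10.0, 0.05))
```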
XVIII. Identification of TRICH Agonists and Antagonists 

TRICH is expressed in a eukaryotic cell line such as CHO (Chinese Hamster Ovary) or HEK
(Human Embryonic Kidney) 293. Ion channel activity of the transformed cells is measured in the
presence and absence of candidate agonists or antagonists. Ion channel activity is assayed using patch
clamp methods well known in the art or as described in Example XVII. Alternatively, ion channel
activity is assayed using fluorescent techniques that measure ion flux across the cell membrane
(Velicelebi, G. et al. (1999) Meth. Enzymol. 294:20-47; West, M.R. and C.R. Molloy (1996) Anal.
Biochem. 241:51-58). These assays may be adapted for high-throughput screening using microplates.
Changes in internal ion concentration are measured using fluorescent dyes such as the Ca2+ indicator
Fluo-4 AM, sodium-sensitive dyes such as SBFI and sodium green, or the Cl- indicator MQAE (all
available from Molecular Probes) in combination with the FLIPR fluorimetric plate reading system
(Molecular Devices). In a more generic version of this assay, changes in membrane potential caused by
ionic flux across the plasma membrane are measured using oxonol dyes such as DiBAC4 (Molecular
Probes). DiBAC4 equilibrates between the extracellular solution and cellular sites according to the
cellular membrane potential. The dye's fluorescence intensity is 20-fold greater when bound to
hydrophobic intracellular sites, allowing detection of DiBAC4 entry into the cell (Gonzalez, J.E. and
P.A. Negulescu (1998) Curr. Opin. Biotechnol. 9:624-631). Candidate agonists or antagonists may be
selected from known ion channel agonists or antagonists, peptide libraries, or combinatorial chemical
libraries.
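
Scoring of microplate fluorescence readings in such a screen can be sketched as follows. This is an illustration only, assuming each well yields a baseline and a peak fluorescence value and that vehicle-only wells are included on the plate. The function names, the labels, and the three-standard-deviation threshold are assumptions and are not part of the original method. In practice, antagonist activity would usually be scored in wells that also receive a known activator.

```python
from statistics import mean, stdev


def delta_f_over_f0(baseline, peak):
    """Return the fractional fluorescence change (F - F0) / F0 for one well."""
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return (peak - baseline) / baseline


def classify_well(response, vehicle_responses, n_sd=3.0):
    """Flag a well whose response departs from the vehicle wells by > n_sd SD."""
    center = mean(vehicle_responses)
    spread = stdev(vehicle_responses)
    if response > center + n_sd * spread:
        return "candidate agonist"
    if response < center - n_sd * spread:
        return "candidate antagonist"
    return "inactive"


if __name__ == "__main__":
    # Made-up vehicle responses and test wells, for illustration only
    vehicle = [0.10, 0.12, 0.11, 0.09]
    for well_response in (0.50, 0.11, 0.01):
        print(well_response, classify_well(well_response, vehicle))
```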

Various modifications and variations of the described methods and systems of the invention will
be apparent to those skilled in the art without departing from the scope and spirit of the invention.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Indeed, various modifications of the described modes for carrying out the invention which are obvious
to those skilled in molecular biology or related fields are intended to be within the scope of the following
claims.

What is claimed is:

1. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

a) an amino acid sequence selected from the group consisting of SEQ ID NO:1-27,

b) a naturally occurring amino acid sequence having at least 90% sequence identity to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-27,

c) a biologically active fragment of an amino acid sequence selected from the group
consisting of SEQ ID NO:1-27, and

d) an immunogenic fragment of an amino acid sequence selected from the group consisting
of SEQ ID NO:1-27.

2. An isolated polypeptide of claim 1 selected from the group consisting of SEQ ID NO:1-27.

3. An isolated polynucleotide encoding a polypeptide of claim 1.

4. An isolated polynucleotide encoding a polypeptide of claim 2.

5. An isolated polynucleotide of claim 4 selected from the group consisting of SEQ ID
NO:28-54.

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide of claim 3.

7. A cell transformed with a recombinant polynucleotide of claim 6. 

8. A transgenic organism comprising a recombinant polynucleotide of claim 6. 

9. A method for producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said
cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide
comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of claim
1, and

b) recovering the polypeptide so expressed.

10. An isolated antibody which specifically binds to a polypeptide of claim 1.

11. An isolated polynucleotide comprising a polynucleotide sequence selected from the
group consisting of:

a) a polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54,

b) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:28-54,

c) a polynucleotide sequence complementary to a),

d) a polynucleotide sequence complementary to b), and

e) an RNA equivalent of a)-d).

12. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 11. 

13. A method for detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 11, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization
complex is formed between said probe and said target polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

14. A method of claim 13, wherein the probe comprises at least 60 contiguous nucleotides.

15. A method for detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 11, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction
amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

16. A composition comprising an effective amount of a polypeptide of claim 1 and a
pharmaceutically acceptable excipient.

17. A composition of claim 16, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-27.

18. A method for treating a disease or condition associated with decreased expression of 
functional TRICH, comprising administering to a patient in need of such treatment the composition of 
claim 16. 

19. A method for screening a compound for effectiveness as an agonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting agonist activity in the sample. 

20. A composition comprising an agonist compound identified by a method of claim 19 and
a pharmaceutically acceptable excipient.

21. A method for treating a disease or condition associated with decreased expression of 
functional TRICH, comprising administering to a patient in need of such treatment a composition of 
claim 20. 

22. A method for screening a compound for effectiveness as an antagonist of a polypeptide 
of claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting antagonist activity in the sample. 

23. A composition comprising an antagonist compound identified by a method of claim 22
and a pharmaceutically acceptable excipient.

24. A method for treating a disease or condition associated with overexpression of functional
TRICH, comprising administering to a patient in need of such treatment a composition of claim 23.

25. A method of screening for a compound that specifically binds to the polypeptide of claim 
1, said method comprising the steps of: 

a) combining the polypeptide of claim 1 with at least one test compound under suitable 
conditions, and 

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby identifying a
compound that specifically binds to the polypeptide of claim 1.

26. A method of screening for a compound that modulates the activity of the polypeptide of
claim 1, said method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under conditions 
permissive for the activity of the polypeptide of claim 1, 

b) assessing the activity of the polypeptide of claim 1 in the presence of the test compound, and 

c) comparing the activity of the polypeptide of claim 1 in the presence of the test compound 
with the activity of the polypeptide of claim 1 in the absence of the test compound, wherein a change in 
the activity of the polypeptide of claim 1 in the presence of the test compound is indicative of a 
compound that modulates the activity of the polypeptide of claim 1. 

27. A method for screening a compound for effectiveness in altering expression of a target 
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method 
comprising: 

a) exposing a sample comprising the target polynucleotide to a compound, under conditions 
suitable for the expression of the target polynucleotide, 

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying amounts of 
the compound and in the absence of the compound. 

28. A method for assessing toxicity of a test compound, said method comprising: 

a) treating a biological sample containing nucleic acids with the test compound; 

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at
least 20 contiguous nucleotides of a polynucleotide of claim 11 under conditions whereby a specific
hybridization complex is formed between said probe and a target polynucleotide in the biological
sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim
11 or fragment thereof;

c) quantifying the amount of hybridization complex; and 

d) comparing the amount of hybridization complex in the treated biological sample with the 
amount of hybridization complex in an untreated biological sample, wherein a difference in the 
amount of hybridization complex in the treated biological sample is indicative of toxicity of the test 
compound. 
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730 


Ser 


Pro 


Pro 


Lys 


Glu 
745 


Val 


Pro 


Val 


Gin 


Leu 
760
Ala Val Glu Arg Pro
10

Pro Gln Ser Pro Arg
25

Glu Leu Asn Val Gly
40

Leu Arg Lys Phe Pro
55

Leu Ala Lys Ala Ser
70

Arg Pro Ser Thr Tyr
85

Gly Gln Val Pro Thr
100

Gln Phe Tyr Glu Ile
115

Pro Gln Ile Phe Gly
130

Gln Val Pro Gly Tyr
145

Ala Arg Ala Glu Ala
160

Cys Leu Val Glu Thr
175

Leu Cys Phe Leu Gln
190
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Val 


Ala 


Leu 


Val 


Ser 










555 


Pro 


Thr 


Glv 


Pro 


Asn 










570 


Gin 


Glu 


Asp 


Glu 


Glv 










585 


Ala 


Ser 


Leu 


Glu 


Leu 










600 


Ala 


Phe 


Gin 


Glu 


Gin 










615 


Leu 


Leu 


Ala 


Tyr 

J: 


Val 










63 0 


Leu 


He 


Ala 


Leu 


Met 










645 


Ser 


Trp 


Ser 


He 


Trp 










660 


Met 


Glu 


Asn 


Glv 


Tvr 










675 


Val 


Met 


Leu 


Thr 


Val 










690 


Arg, Trp 


Cvs 


Phe 


Ara 










705 


Gin 


Thr 


Leu 


Pro 


Thr 










720 


Pro 


Arg 


Thr 


Leu 


Glu 










735 


Asp 


Glu 


Asp 


Gly 


Ala 










750 


Leu 


Gin 


Ser 


Asn 




Val 


Gly 


Ara 


Met 


Thr 










15 


Pro 


Arg 


Arg 


Pro 


Thr 










30 


Gly Glu 


Phe 


His 


Thr 










45 


Gly 


Ser 


Lys 


Leu 


Ala 










60 


Thr 


Asp 


Ala 


Glu 


Gly 










75 


Phe 


Arg 


Pro 


He 


Leu 










90 


Gin 


His 


He 


Pro 


Glu 










105 


Lys 


Pro 


Leu 


Val 


Lys 










120 


Glu 


Gin 


Val 


Ser 


Arg 










135 


Ser 


Glu 


Asn 


Leu 


Glu 










150 


He 


Thr 


Ala 


Arg 


Lys 










165 


Glu 


Glu 


Gin 


Asp 


Ala 










180 


Asp 


Lys 


Lys 


Met 


Phe 
10; 
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Lys 


Ser 


Val 


Val 


Lys 
200 


Phe Gly 


Pro 


Trr> 


Ser 


Asp 


Leu 


Met 


His 


Cys Leu 


Glu 


Met 








215 








Tyr 


Lys 


Val 


Phe 


Ser 
230 


Lys Phe 


Tyr 


Leu 


Asn 


Glu 


Phe 


His 


Phe 


Asn He 


Tyr 


Ser 










245 
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Met 


Ala 


Gly 


Arg 


Arg 


Leu 


Asn 


Leu 


Arg 


i 








5 










Cys 


Val 


Leu 


Leu 


Met 
20 


Ala 


Glu 


Thr 


Val 


Ser 


Thr 


Gly 


Ala 


His 
35 


He 


Ser 


Pro 


Gin 


Asn 


Gin 


Thr 


Pro 


Val 
50 


Val 


Asp 


Cys 


Arg 


Val 


Ser 


Asp 


Arg 


Cys 
65 


Asp 


Phe 


He 


Arg 


Ser 


Asp 


Gly 


Gly 


Tyr 
80 


Leu 


Asp 


Tyr 


Leu 


Phe 


Pro 


Pro 


Ser 


Leu 
95 


Leu 


Pro 


Leu 


Ala 


Trp 


Leu 


Leu 


Tyr 


Leu 
110 


Phe 


Leu 


He 


Leu 


Phe 


Phe 


Cys 


Pro 


Asn 
125 


Leu 


Ser 


Ala 


He 


Ser 


His 


Asn 


Val 


Ala 


Gly 


Val 


Thr 


Phe 










140 








Ala 


Pro 


Asp 


He 


Phe 
155 


Ser 


Ala 


Leu 


Val 


Thr 


Ala 


Gly 


Leu 


Ala 


Leu 


Gly 


Ala 


Leu 










170 








Val 


Thr 


Thr 


Val 


Val 
185 


Ala 


Gly 


Gly 


He 


Met 


Ala 


Ala 


Ser 


Arg 
200 


Pro 


Phe 


Phe 


Arg 


Val 


Ala 


Val 


Phe 


Leu 
215 


Thr 


Phe 


Leu 


Met 


Thr 


Leu 


Ala 


Trp 


Ala 

230 


Leu 


Gly 


Tyr 


Leu 


Val 


Val 


Thr 


Val 


He 
245 


Leu 


Cys 


Thr 


Trp 


Arg 


Gly 


Ser 


Leu 


Phe 
260 


Cys 


Pro 


Met 


Pro 


Ser 


Asp 


Ser 


Glu 


Glu 
275 


Asp 


Arg 


Val 


Ser 


Asp 


Tyr 


Gly 


Asp 


Glu 
290 


Tyr 


Arg 


Pro 


Leu 


Thr 


Ala 


Gin 


He 


Leu 
305 


Val 


Arg 


Ala 


Leu 


Lys 


Trp 


Arg 


Arg 


Lys 
320 


Ser 


Ala 


Tyr 


Trp 


Lys 


Leu 


Pro 


Val 


Glu 
335 


Phe 


Leu 


Leu 


Leu 


Asp 


Pro 


Asp 


Lys 


Asp 


Asp 


Gin 


Asn 


Trp 
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Lys 


Ala 


Val 


Leu 


Asp 


Asn 


205 










210 


Asp 


He 


Lys 


Ala 


Gin 


Gly 


220 










225 


Thr 


Tyr 


Pro 


Thr 


Lys 


Arg 


235 










240 


Phe 


Thr 


Phe 


Thr 


Trp 


Trp 


250 










255 


Tro 


Ala 


Leu 


Ser 


Val 


Leu 


10 










15 


Ser 


Gly Thr 


Arg 


Glv 


Ser 


25 










30 


Phe 


Pro 


Ala 


Ser 


Glv 


Val 


40 










45 


Lys 


Val 


Cys 


Gly 


Leu 


Asn 


55 










60 


Thr 


Asn 


Pro 


Asp 


Cys 


His 


70 










75 


Glu 


Gly 


He 


Phe 


Cys 


His 


85 










90 


Val 


Thr 


Leu 




Val 


Ser 


100 








105 


Glv 


Val 


Thr 


Ala 


Ala 


Lys 


115 










120 


Ser 


Thr 


Thr 


Leu 


Lys 


Leu 


130 










13 5 


Leu 


Ala 


Phe 


Gly 


Asn 


Glv 


145 










150 


Ala 


Phe 


Ser 


Asp 


Pro 


His 


160 








165 


Phe 


Gly Ala Gly 


Val 


Leu 


175 










180 


Thr 


He 


Leu 


His 


Pro 


Phe 


190 










195 


Asp 


He 


Val 


Phe 


Tvr 


Met 


205 










210 


Leu 


Phe 


Arg 


Gly 


Arg 


Val 


220 










225 


Glv 


Leu 


Tyr 


Val 


Phe 


Tvr 


235 










240 


He 


Tyr 


Gin 


Arg 


Gin 


Arg 


250 










255 


Val 


Thr 


Pro 


Glu 


He 


Leu 


265 










270 


Ser 


Asn 


Thr 


Asn 


Ser 


TVr 


280 










285 


Phe 


Phe 


Tyr 


Gin 


Glu 


Thr 


295 










300 


Asn 


Pro 


Leu Asp 


Tyr 


Met 


310 










315 


Lys 


Ala 


Leu 


Lys 


Val 


Phe 


325 










330 


Leu 


Thr 


Val 


Pro 


Val 


Val 


340 










345 


Lys 


Arg 


Pro 


Leu 


Asn 


Cys 
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Leu 


His 


Leu 


Val 


He 
365 


Ser 


Pro 


Leu 


Val 


Ser 


Gly 


Thr 


Tyr 


Gly 
380 


Val 


Tyr 


Glu 


He 


Trp 


Val 


Val 


Val 


Val 


He 


Ala 


Gly Thr 










395 










Phe 


Phe 


Ala 


Thr 


Ser 
410 


Asp 


Ser 


Gin 


Pro 


Phe 


Ala 


Phe 


Leu 


Gly 
425 


Phe 


Leu 


Thr 


Ser 


Ala 


Ala 


Thr 


Glu 


Val 
440 


Val 


Asn 


He 


Leu 


Phe 


Arcr 


Leu 


Ser 


Asn 


Thr 


Val 


Leu 


Gly 








455 










Gly 


Asn 


Ser 


He 


Gly 
470 


Asp 


Ala 


Phe 


Ser 


Gin 


Gly 


TVr 


Pro 


Arg 
485 


Met 


Ala 


Phe 


Ser 


lie 


Phe 


Asn 


He 


Leu 
500 


Val 


Gly 


Val 


Gly 


lie 


Ser 


Arg 


Ser 


His 


Thr 


Glu 


Val 


Lys 








515 










Leu 


Val 


Trp 


Val 


Leu 


Ala 


Gly 


Ala 


Leu 








530 








Ser 


Leu 


Val 


Ser 


Val 
545 


Pro 


Leu 


Gin 


Cys 


Tyr 


Gly 


Phe 


Cys 


Leu 
560 


Leu 


Leu 


Phe 


Tyr 


Ala 


Leu 


Leu 


He 


Glu 


Phe 


Gly 


Val 


Ile
<400> 6

Met Lys Leu Ser Lys Lys Asp Arg Gly

1 5
Ser Ala Lys Lys Lys Leu Asp Trp Ser
20

Ser Leu Ala Gly Ala Phe Gly Ser Ser
35

Leu Ser Val Val Asn Ala Pro Thr Pro

50

Asn Glu Ser Trp Glu Arg Arg His Gly

65

Thr Leu Thr Leu Leu Trp Ser Val Thr

80

Gly Gly Leu Val Gly Thr Leu Ile Val

95

Leu Gly Arg Lys His Thr Leu Leu Ala
110

Ser Ala Ala Leu Leu Met Ala Cys Ser
125

Glu Met Leu Ile Val Gly Arg Phe Ile
140

Val Ala Leu Ser Val Leu Pro Met Tyr
155

Lys Glu Ile Arg Gly Ser Leu Gly Gln
170

Cys Ile Gly Val Phe Thr Gly Gln Leu
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355 










360 


Val 


Val 


Leu 


Thr 


Leu 


Gin 


370 










375 


Gly 

St 


Gly 


Leu 


Val 


Pro 


Val 


385 
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Ala 


Leu 
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Ser 


Val 


Thr 


400 










405 


Pro 


Arcr 


Leu 


His 


Trp 


Leu 


415 










420 


Ala 


Leu 


Trp 


He 


Asn 


Ala 


430 










435 


Arg 


Ser 


Leu Gly Val 


Val 


445 










450 


Leu 


Thr 


Leu 


Leu 


Ala 


Trp 


460 










465 


Asp 


Phe 


Thr 


Leu 


Ala 


Arg 


475 










480 


Ala 


Cys 


Phe 


Gly Gly 


He 


490 










495 


Leu 


Gly Cys 


Leu 


Leu 


Gin 


505 










510 


Leu 


Glu 


Pro 


Asp 


Gly Leu 


520 










525 


Gly Leu 


Ser 


Leu 


Val 


Phe 


535 










540 


Phe 


Gin 


Leu 


Ser 


Arg 


Val 


550 










555 


Leu 


Asn 


Phe 


Leu 


Val 


Val 


565 










570 


His 


Leu 


Lys 


Ser 


Met 




580 













Glu 


Asp 


Glu 


Glu 


Ser 


Asp 


10 










15 


Cys 


Ser 


Leu 


Leu 


Val 


Ala 


25 










30 


Phe 


Leu 


Tyr 


Gly 


Tyr 


Asn 


40 










45 


Tyr 


He 


Lys 


Ala 


Phe 


Tyr 


55 










60 


Arg 


Pro 


He 


Asp 


Pro 


Asp 


70 










75 


Val 


Ser 


He 


Phe 


Ala 


He 


85 










90 


Lys 


Met 


He 


Gly 


Lys 


Val 


100 










105 


Asn 


Asn 


Gly 


Phe 


Ala 


He 


115 








120 


Leu Gin Ala Gly Ala 


Phe 


130 










135 


Met 


Gly 


He Asp Gly Gly 


145 










150 


Leu 


Ser 


Glu 


He 


Ser 


Pro 


160 










165 


Val 


Thr 


Ala 


He 


Phe 


He 


175 










180 


Leu 


Gly 


Leu 


Pro 


Glu 


Leu 
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185 190 195 



Leu 


GlV 


Lvs 


GlU 


Ser 


Thr 


Trp 


Pro 


Tyr 


Leu 


Phe 


Gly 


Val 


He 


Val 










200 










205 










210 


Val 


Pro 


Ala 


Val 


Val 


Gin 


Leu 


Leu 


Ser 


Leu 


Pro 


Phe 


Leu 


Pro 


Asp 










215 










220 










225 


Ser 


Pro 


Arg 


Tyr 


Leu 


Leu 


Leu 


Glu 


Lys 


His 


Asn 


Glu Ala Arg 


Ala 










230 










235 










240 


Val 


Lys 


Ala 


Phe 


Gin 


Thr 


Phe 


Leu 


Gly 


Lys 


Ala 


Asp Val 


Ser 


Gin 










245 










250 










255 


Glu 


Val 


Glu 


Glu 


Val 


Leu 


Ala 


Glu 


Ser 


His 


Val 


Gin 


Arg 


Ser 


He 










260 










265 










270 


Arg 


Leu 


Val 


Ser 


Val 


Leu 


Glu 


Leu 


Leu 


Arg 


Ala 


Pro 


Tyr 


Val 


Arg 










275 










280 










285 


Trp 


Gin 


Val 


Val 


Thr 


Val 


lie 


Val 


Thr 


Met 


Ala 


Cys 


Tyr 


Gin 


Leu 










290 










295 










300 


Cvs 

J: 


Gly 


Leu 


Asn 


Ala 


He 


Trp 


Phe 


Tyr 


Thr 


Asn 


Ser 


He 


Phe 


Gly 










305 










310 










315 


Lvs 


Ala 


Gly 


He 


Pro 


Leu 


Ala 


Lys 


He 


Pro 


Tvr 


Val 


Thr 


Leu 


Ser 










320 










325 










330 


Thr 


Gly 


Gly 


He 


Glu 


Thr 


Leu 


Ala 


Ala 


Val 


Phe 


Ser 


Gly Leu 


Val 










335 










340 










345 
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GlU 


His 


Leu 


Glv 


Arg Arg 


Pro 


Leu 


Leu 


He 


Gly Gly Phe 


Gly 










350 










355 










360 


Leu 


Met 


Gly Leu 


Phe 


Phe Gly Thr 


Leu 


Thr 


He 


Thr 


Leu 


Thr 


Leu 










365 










370 










375 


Gin 


Asp 


His 


Ala 


Pro 


Trp Val 


Pro 


Tvr 


Leu 


Ser 


He 


Val 


Gly 


He 










380 










385 










390 


Leu 


Ala 


He 


He 


Ala 


Ser 


Phe 


Cys 


Ser 


Glv 


Pro 


Ala 


Val 


Phe 


Pro 
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405 


Glu 


Glu 


Thr 


Val 


Asn 


Val 


Ser 


He 
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Met 


Thr 


GlU 


Lys 


Thr 


Asn 


Gly 


Val 


Lvs 


Ser 


Ser 


Pro 


Ala 


Asn 


Asn 


1 






5 








10 










15 


His 


Asn 


His 


His 


Ala 


Pro 


Pro 


Ala 


He 


Lys 


Ala 


Asn 


Gly 


Lys 


Asp 










20 










25 










30 


Asp 


His 


Arg 


Thr 


Ser 


Ser 


Arg 


Pro 


His 


Ser 


Ala 


Ala 


Asp 


Asp 


Asp 










35 










40 










45 


Thr 


Ser 


Ser 


Glu 


Leu 


Gin 


Arg 


Leu 


Ala 


Asp 


Val 


Asp 


Ala 


Pro 


Gin 










50 










55 










60 


Gin 


Gly 


Arg 


Ser 


Gly 


Phe 


Arg 


Arg 


He 


Val 


Arg 


Leu 


Val 


Gly 


He 










65 










70 










75 


He 


Arg 


Glu 


Trp 


Ala 


Asn 


Lys 


Asn 


Phe 


Arg 


Glu 


Glu 


Glu 


Pro 


Arg 










80 










85 










90 


Pro 


Asp 


Ser 


Phe 


Leu 


Glu 


Arg 


Phe 


Arg 


Gly 


Pro 


Glu 


Leu 


Gin 


Thr 










95 










100 










105 


Val 


Thr 


Thr 


Gin 


Glu 


Gly 


Asp 


Gly 


Lys 


Gly 


Asp 


Lys 


Asp 


Gly 


Glu 










110 










115 










120 


Asp 


Lys 


Gly 


Thr 


Lys 


Lys 


Lys 


Phe 


Glu 


Leu 


Phe 


Val 


Leu Asp 


Pro 










125 










130 










135 


Ala 


Gly 


Asp 


Trp 


Tyr 


Tyr 


Cys 


Trp 


Leu 


Phe 


Val 


He 


Ala 


Met 


Pro 










140 










145 










150 


Val 


Leu 


Tyr 


Asn 


Trp 


Cys 


Leu 


Leu 


Val 


Ala 


Arg 


Ala 


Cys 


Phe 


Ser 










155 










160 










165 


Asp 


Leu 


Gin 


Lys 


Gly 


Tyr 


Tyr 


Leu 


Val 


Trp 


Leu 


Val 


Leu Asp 


Tyr 










170 










175 










180 


Val 
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Asp 


Val 


Val 


Tyr 


He 


Ala 


Asp 


Leu 


Phe 


He 


Arg 


Leu 


Arg 
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Thr 


Gly 


Phe 


Leu 


Arg 


Asp 


Ala 


Ser 


He 


His 


Ser 


Pro 


Met 


Phe 


Glu 


Asn 


He 


Phe 


He 


His 


Trp 


Gly 


Phe 


Gly 


Glu 


Tyr 


Gly 


Ser 


Thr 


Leu 


Lys 


As P 


Glu 


Val 


Leu 


He 


Ser 


Asn 


Met 


Ala 


Val 


Lys 


Glu 


Ala 


Lys 


Lys 


Thr 


Val 


Leu 


Arg 


Ala 


Lys 


Val 


Arg 


Leu 


Val 


Leu 


He 


Cys 


Arg 


Glu 


Gly 


Lys 


Ala 


Leu 


Leu 


Asn 


He 


Lys 


Arg 


Ser 


Leu 


Leu 


Met 


Glu 


Glu 


Glu 


Arg 


Glu 


Asn 


Glu 


Leu 


Gly 


Gin 


Gly 


Arg 


Leu 


Gin 


Arg 
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Asp 


Asp 


Tyr 


Ala 


Asp 


Glu 
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Leu 


Glu 
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Asn 


Tvrr 


He 




215 
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Pro 


Thr 
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Glu 


Val 


Arg 
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Asp 
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Arg 
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Ser 
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Ala 
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Arg 
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He 
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He 
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He 
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He 
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Lys 
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Glu 
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Glu 


He 
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Arg 
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Arg 


Val 


Val 


Glu 


Gly 


Pro 


Ala 


Gin 


Ala 


Met 


Ser 


Arg 








260 








265 










270 


Gly Thr 


Met 


He 


Val 


Gly 


Ala 


Ala 


Thr 


Gly 


Gly 


He 


Leu 


Leu 


Leu 








275 










280 










285 


Leu Asp 


Val 


Val 


Ser 


Leu 


Ala 


Tvr 


Glu 


Ser 


Lvs 


His 


Leu 


Leu 


Glu 








290 








295 








300 


Gly Ala 


Lys 


Ser 


Glu 


Ser 


Ala 


Glu 


Glu 


Leu 


Lys 


Lys 


Arg 


Ala 


Gin 








3 05 










310 










315 


Glu Leu 


Glu Gly 


Lys 


Leu 


Asn 


Phe 


Leu 


Thr 


Lvs 


He 


His 


Glu 


Met 








320 










325 








330 


Leu Gin 


Pro 


Gly 


Gin 


Asp 


Gin 
























335 
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<400> 21 


























Met Ala 


Thr 


Trp 


Asp 


Glu 


Lys 


Ala 


Val 


Thr 


Arg 


Arg 


Ala 


Lvs 


Val 


1 






5 










10 










15 


Ala Pro 


Ala 


Glu 


Arg 


Met 


Ser 


Lys 


Phe 


Leu 


Arcr 


His 


Phe 


Thr 


Val 








20 










25 










30 


Val Gly 


Asp 


Asp 


Tyr 


His 


Ala 


Trp 


Asn 


He 


Asn 


Tvr 


LVS 


Lys 


Trp 








35 










40 










45 


Glu Asn 


Glu 


Glu 


Glu 


Glu 


Glu 


Glu 


Glu 


Glu 


Gin 


Pro 


Pro 


Pro 


Thr 








50 










55 










60 


Pro Val 


Ser 


Gly 


Glu 


Glu 


Gly 


Arg 


Ala 


Ala 


Ala 


Pro 


Asp 


Val 


Ala 








65 










70 










75 


Pro Ala 


Pro 


Gly 


Pro 


Ala 


Pro 


Arg 


Ala 


Pro 


Leu 


Asp 


Phe 


Arg 


Gly 








80 










85 










90 


Met Leu 


Arg 


Lys 


Leu 


Phe 


Ser 


Ser 


His 


Arg 


Phe 


Gin 


Val 


He 


He 








95 










100 










105 


lie Cys 


Leu 


Val 


Val 


Leu 


Asp 


Ala 


Leu 


Leu 


Val 


Leu 


Ala 


Glu 


Leu 








110 










115 










120 


Ile Leu


Asp 


Leu 


Lys 


Ile


Ile


Gln


Pro 


Asp 


Lys 


Asn 


Asn 


Tyr 


Ala 








125 










130 










135 


Ala Met 


Val 


Phe 


His 


Tyr 


Met 


Ser 


Ile


Thr 


Ile


Leu 


Val 


Phe 


Phe 








140 










145 










150 


Met Met 


Glu 


Ile


Ile


Phe 


Lys 


Leu 


Phe 


Val 


Phe 


Arg 


Leu 


Glu 


Phe 








155 










160 










165 


Phe His 


His 


Lys 


Phe 


Glu 


Ile


Leu 


Asp 


Ala 


Val 


Val 


Val 


Val 


Val 








170 










175 










180 


Ser Phe 


Ile


Leu 


Asp 


Ile


Val 


Leu 


Leu 


Phe 


Gln


Glu 


His 


Gln


Phe 








185 










190 










195 


Glu Ala 


Leu 


Gly 


Leu 


Leu 


Ile


Leu 


Leu 


Arg 


Leu 


Trp 


Arg 


Val 


Ala 








200 










205 










210 


Arg Ile


Ile


Asn 


Gly 


Ile


Ile


Ile


Ser 


Val 


Lys 


Thr 


Arg 


Ser 


Glu








215 










220 










225 


Arg Gln


Leu 


Leu 


Arg 


Leu 


Lys 


Gln


Met 


Asn 


Val 


Gln


Leu 


Ala 


Ala 








230 










235 










240 


Lys Ile


Gln


His 


Leu 


Glu 


Phe 


Ser 


Cys 


Ser 


Glu 


Lys 


Glu 


Gln


Glu 








245 








250 








255 


Ile Glu


Arg 


Leu 


Asn 


Lys 


Leu 


Leu 


Arg 


Gln


His 


Gly 


Leu 


Leu 


Gly 
Met


Gly 


Gly 


Lys 


Gln


1 








5 


Pro 


Val 


Lys 


Tyr 


Asp 
20 


Ser 


Cys 


Thr 


Asp 


Val 
35 


Leu 


Gly 


Tyr 


Ile


Val 
50 


Pro 


Arg 


Gln


Val 


Leu 
65 


Gly 


Met 


Gly 


Glu 


Asn 
80 


Ile


Phe 


Ser 


Cys 


Ile
95 


Asn 


Gly 


Leu 


Gln


Cys 
110 


Pro 


Glu 


Asp 


Pro 


Trp 
125 


Val 


Gly 


Glu 


Val 


Phe 
140 


Gly 


Val 


Pro 


Trp 


Asn 
155 


Leu 


Cys 


Pro 


Ser 


Phe 
170 


Cys 


Phe 


Pro 


Trp 


Thr 
185 


Thr 


Asn 


Asp 


Thr 


Thr 
200 


Ser 


Leu 


Asn 


Ala 


Arg 
215 


Ala 


Gln


Ser 


Trp 


Tyr 
230 


Val 


Leu 


Ser 


Leu 


Leu 
245 


Pro 


Leu 


Val 


Leu 


Val 
260 


Tyr 


Gly 


Ile


Tyr 


Tyr 
275 


Lys 


Gly 


Ala 


Ser 


Ile
290 


Ala 


Tyr 


Gln


Ser 


Val 

305 


Leu 


Ala 


Val 


Leu 


Glu 
320 


Arg 


Gln


Arg 


Ile


Arg 
335 


Lys 


Ala 


Val 


Gly 


Gln
350 


Thr 


Phe 


Val 


Leu 


Leu 
365 


Ala 


Leu 


Tyr 


Leu 


Ala 
380 


Ala 


Ser 


Asn 


Ile


Ser 
Arg


Asp 


Glu 


Asp 


Asp 










10 


Pro 


Ser 


Phe 


Arg 


Gly










25 


Ile


Cys 


Cys 


Val 


Leu 










40 


Val 


Gly


Ile


Val 


Ala 








55 


Tyr


Pro 


Arg Asn 


Ser 










70 


Lys 


Asp 


Lys 


Pro 


Tyr










85 


Leu 


Ser 


Ser 


Asn 


Ile










100 


Pro 


Thr 


Pro 


Gln


Val 










115 


Thr 


Val 


Gly Lys 


Asn 










130 


Tyr


Thr 


Lys 


Asn 


Arg 










145 


Met 


Thr 


Val 


Ile


Thr 










160 


Leu 


Leu 


Pro 


Ser 


Ala 










175 


Asn 


Ile


Thr 


Pro 


Pro 










190 


Ile


Gln


Gln Gly


Ile










205 


Asp 


Ile


Ser 


Val 


Lys 










220 


Trp 


Ile


Leu 


Val 


Ala 










235 


Phe 


Ile


Leu 


Leu 


Leu 










250 


Leu 


Ile


Leu 


Gly 


Val 










265 


Cys 


Trp 


Glu 


Glu 


Tyr 










280 


Ser 


Gln


Leu 


Gly 


Phe 










295 


Gln


Glu 


Thr 


Trp 


Leu 










310 


Ala 


Ile


Leu 


Leu 


Leu 










325 


Ile


Ala


Ile


Ala 


Leu 










340 


Met 


Met 


Ser 


Thr 


Met 










355 


Leu 


Ile


Cys


Ile


Ala 








370 


Thr 


Ser 


Gly Gln


Pro 










385 


Ser 


Pro 


Gly Cys 


Glu 



Glu 


Ala 


Tyr 


Gly 


Lys 
15 


Pro 


Ile


Lys 


Asn 


Arg 
30 


Phe 


Leu 


Leu 


Phe 


Ile
45 


Trp 


Leu 


Tyr 


Gly 


Asp 
60 


Thr 


Gly 


Ala 


Tyr 


Cys 
75 


Leu 


Leu 


Tyr 


Phe 


Asn 
90 


Ile


Ser 


Val 


Ala 


Glu 
105 


Cys 


Val 


Ser 


Ser 


Cys 
120 


Glu 


Phe 


Ser 


Gln


Thr 

135 


Asn 


Phe 


Cys 


Leu 


Pro 
150 


Ser 


Leu 


Gln


Gln


Glu 
165 


Pro 


Ala 


Leu 


Gly 


Arg 
180 


Ala 


Leu 


Pro 


Gly 


Ile
195 


Ser 


Gly 


Leu 


Ile


Asp 
210 


Ile


Phe 


Glu 


Asp 


Phe 
225 


Leu 


Gly 


Val 


Ala 


Leu 
240 


Arg 


Leu 


Val 


Ala 


Gly 
255 


Leu 


Gly 


Val 


Leu 


Ala 
270 


Arg 


Val 


Leu 


Arg 


Asp 
285 


Thr 


Thr 


Asn 


Leu 


Ser 
300 


Ala 


Ala 


Leu 


Ile


Val 
315 


Val 


Leu 


Ile


Phe 


Leu 
330 


Leu 


Lys 


Glu 


Ala 


Ser 
345 


Phe 


Tyr 


Pro 


Leu 


Val 
360 


Tyr 


Trp 


Ala 


Met 


Thr 
375 


Gln


Tyr 


Val 


Leu 


Trp 
390 


Lys 


Val 


Pro 


Ile


Asn 
395 400 405



Thr 


Ser 


Cys 


Asn 


Pro 


Thr 


Ala 


His 


Leu 


Val 


Asn 


Ser 


Ser 


Cys 


Pro 








410 










415 










420 


Gly 


Leu 


Met 


Cys 


Val 


Phe 


Gln


Gly


Tyr 


Ser 


Ser 


Lys 


Gly


Leu 


Ile










425 










430 










435 


Gln


Arg 


Ser 


Val 


Phe 


Asn 


Leu 


Gln


Ile


Tyr 


Gly Val 


Leu 


Gly


Leu 










440 










445 










450 


Phe 




Thr 


Leu 


Asn 


Trp


Val 


Leu 


Ala 


Leu 


Gly Gln


Cys


Val 


Leu 








455 










460 










465 


Ala 


Gly 


Ala 


Phe 


Ala 


Ser 


Phe 


Tyr


Trp Ala 


rile 


His 


Lys 


Pro 


Gln










470 










475 










480 


Asp 


Ile


Pro 


Thr 


Phe 


Pro 


Leu 


Ile


Ser 


Ala 


Phe 


Ile


Arg 


Thr 


Leu 








485 










490 










495 


Arg 


Tyr


His 


Thr 


Gly


Ser 


Leu 


Ala 


Phe 


Gly 


Ala 


Leu 


Ile


Leu 


Thr 








500 










505 










510 


Leu 


Val 


Gln


Ile


Ala 


Arg 


Val 


Ile


Leu 


Glu 


Tyr 


Ile


Asp 


His 


Lys 










515 










520 










525 


Leu




Gly Val 


Gln


Asn 


Pro 


Val 


Ala Arg 


Cys 


Ile


Met 


Cys 


Cys 










530 










535 










540 


Phe 




Cys 


Cys 


Leu 


X X 


Cys 


Leu 


Glu 


Lys 


Phe 


Ile


Lys 


Phe 


Leu 










545 










550 










555 


noil 


ni y 


Asn 


Ala 


xy x 


Ile




Ile


Ala 


Ile


Tyr 


Gly 


Lys 


Asn 


Phe 










560 










565 










570 


Cys 


Val 


Ser 


Ala 


Lys 


Asn 


Ala 


Phe 


Met 


Leu 


Leu 


Met 


Arg 


Asn 


Ile








575 










580 










585 


V d X 


A ITT 

^ x jl y 


Val 


Val 


Val 


Leu 


Asp 


Lys 


Val 


Thr 


Asp 


Leu 


Leu 


Leu 


Phe 










590 










595 










600 


Phe 


Gly 


Lys 


Leu 


Leu 


Val 


Val 


Gly 


Gly Val 


Gly Val 


Leu 


Ser 


Phe 










605 










610 










615 


Phe 


Phe 


Phe 


Ser 


Gly 


Arg 


Ile


Pro 


Gly 


Leu 


Gly 


Lys 


Asp 


Phe 


Lys 










620 










625 










630 


Ser 


-IT -L ' 


His 


Leu 


Asn 


xyx 


xyx 


J-x 


Leu 


Pro 


Ile


Met 


Thr 


Ser 


Ile










635 










640 










645 


Leu


Gly 


Ala 


Tyr 


Val 


Ile


Ala 


Ser 


Gly 


Phe 


Phe 


Ser 


Val 


Phe 


Gly










650 










655 










660 


Met 


Cys 


Val 


Asp 


Thr 


Leu 


Phe 


Leu 


Cys 


Phe 


Leu 


Glu 


Asp 


Leu 


Glu 






665 










670










675 


Arg 


Asn 


Asn Gly 


Ser 


Leu 


Asp 


Arg 


Pro 


Tyr 


Tyr 


Met 


Ser 


Lys 


Ser 










680 










685 










690 


Leu 


Leu 


Lys 


Ile


Leu 


Gly 


Lys 


Lys 


Asn 


Glu 


Ala 


Pro 


Pro 


Asp 


Asn 










695 










700 










705 


Lys 


Lys 


Arg 


Lys 


Lys 
Glu


Gln


Asn 


Phe 


Asp 


Gly 


Thr 


Ser 


Asp Glu 


Glu 


His 


Glu 


Gln


Glu 


1 








5 






10 










15 


Leu 


Leu 


Pro 


Val 


Gln


Lys 


His 


Tyr 


Gln Leu


Asp 


Asp 


Gln


Glu 


Gly 










20 








25 










30 


Ile


Ser 


Phe 


Val 


Gln


Thr 


Leu 


Met 


His Leu 


Leu 


Lys 


Gly Asn 


Ile










35 








40 










45 


Gly 


Thr 


Gly 


Leu 


Leu 


Gly 


Leu 


Pro 


Leu Ala 


Ile


Lys 


Asn 


Ala 


Gly 










50 








55 










60 


Ile


Val 


Leu 


Gly 


Pro 


Ile


Ser 


Leu 


Val Phe 


Ile


Gly


Ile


Ile


Ser 










65 








70 










75 


Val 


His 


Cys 


Met 


His 


Ile


Leu 


Val 


Arg Cys 


Ser 


His 


Phe 


Leu 


Cys 










80 








85 










90 


Leu 


Arg 


Phe 


Lys 


Lys 


Ser 


Thr 


Leu 


Gly Tyr 


Ser 


Asp 


Thr 


Val 


Ser 
95



Phe 


Ala 


Met 


Glu 


Val 
110 


Ser 


Pro 


Trp 


Ala 


Trp 


Gly Arg 


Ser 


Val 


Val 


Asp 










125 








Leu 


Gly 


Phe 


Cys 


Ser 
140 


Val 


Tyr 


Ile


Lys


Gln


Val 


His 


Glu 
155 


Gly 


Phe 


Leu 


Asn 


Ser 


Thr 


Asn 


Ser 
170 


Ser 


Asn 


Pro 


Leu 


Arg 


Ile


Tyr 


Met 
185 


Leu 


Cys 


Phe 


Val 


Phe 


Ile


Arg 


Glu 
200 


Leu 


Lys 


Asn 


Ala 


Asn 


Val 


Ser 


Met 
215 


Ala 


Val 


Ser 


Val 


Val 


Arg Asn 


Met 


Pro 


Asp 


Pro 










230 






Gly 


Trp 


Lys 


Lys 


Tyr 
245 


Pro 


Leu 


Phe 


Phe 


Glu 


Gly 


Ile


Gly 
260 


Val 


Val 


Leu 


Glu 


Ser 


Lys 


Arg 


Phe 
275 


Pro 


Gln


Ala 


Val 


Thr 


Thr 


Leu 


Tyr 
290 


Val 


Thr 


Leu 


Phe 


His 


Asp 


Glu 


Ile
305 


Lys


Gly


Ser 


Asp 


Val 


Trp 


Leu 


Tyr 
320 


Gln


Ser 


Val 


Ile


Phe 


Val 


Thr 


Tyr 
335 


Ser 


Ile


Gln


Ile


Ile


Pro 


Gly 


Ile
350 


Thr 


Ser 


Lys


Ile


Cys 


Glu 


Phe 


Gly 
365 


Ile


Arg 


Ser 


Ala 


Gly 


Ala 


Ile


Leu 
380 


Ile


Pro 


Arg 


Val 


Gly 


Ala 


Val 


Ser 
395 


Ser 


Ser 


Thr 


Leu 


Val 


Glu 


Ile


Leu 
410 


Thr 


Phe 


Ser 


Met 


Val 


Leu 


Lys 


Asn 
425 


Ile


Ser


Ile


Phe 


Leu 


Leu Gly 


Thr 


Tyr 


Ile


Thr 










440 






Thr 


Pro 


Lys 


Val 


Val 


Ala Gly Thr 










455 








Leu 


Asn 


Ser 


Thr 


Cys 


Leu 


Thr 


Ser 
<400> 24

Met Gly Leu Thr Phe Ile Asn Ala

1 5 
Asn Thr Leu Arg Ala Leu Gly His 

20 

Leu Ala Gly Ala Ala Ala Gly Ala 





100 










105 


Ser 


Cys 


Leu 


Gln


Lys


Gln


Ala 




115 










120 


Phe 


Phe 


Leu 


Val 


Ile


Thr


Gln




130 










135 


Val 


Phe 


Leu 


Ala 


Glu 


Asn 


Val 




145 










150 


Glu 


Ser 


Lys 


Val 


Phe 


Ile


Ser 




160 










165 


Cys


Glu 


Arg 


Arg 


Ser 


Val 


Asp 




175 










180 


Leu 


Pro 


Phe 


Ile


Ile


Leu 


Leu 




190 










195 


Leu 


Phe 


Val 


Leu 


Ser 


Phe 


Leu 




205 










210 


Leu 


Val 


Ile


Ile


x_y x 


Gln


xy x 




220 










225 


His 


Asn 


Leu 


Pro 


Ile


Val 


Ala 




235 










240


Phe 


Gly


Thr 


Ala 


Val 


Phe 


Ala 




250 










£i J 


Pro 


Leu 


Glu 


Asn 


Gln


Met 


i-j_y o 




265 










270 


Leu 


Asn 


Ile


Gly


Met


Gly


Ile




280 










O J> 


Ala 


Thr 


Leu 


Gly 


xy x 


Met 


Cys 




295 










300


Ile


Thr 


Leu 


Asn 


Leu 


Pro 


Gln




310 










— > j — > 


Lys 


Ile


Leu 




Ser 


Phe 


Gly 




325 










330


Phe 


Tyr


Val 


Pro 


Ala 


Glu 


X X fc= 




340 










345 


Phe 


His 


Thr 


Lys 




Lys 


Gln




355 










360


Phe 


Leu 


Val 


Ser 


Ile


Thr 


Cys 




370 












Leu 


Asp 


Ile


Val


Ile


Ser 


Phe 




385 










390 


Leu 


Ala 


Leu 


Ile


Leu 


Pro 


Pro 




400 










405 


Lys 


Glu 


His 




Asn 


Ile






415 










420 


Ala 


Phe 


Thr 


Gly


Val 


Val 


Gly 




430 










435 


Val 


Glu 


Glu 


Ile


Ile


Tyr 


Pro 




445 










450 


Pro 


Gln


Ser 


Pro 


Phe 


Leu 


Asn 




460 










465 


Gly 


Leu 


Lys 












475 












Leu 


Val 


Phe 


Gly 


Val 


Gln


Gly 




10 










15 


Asp 


Ser 


Pro 


Leu 


Asn 


Gln


Phe 




25 










30 


Ile


Gln


Cys 


Val 


Ile


Cys 


Cys 
Pro


Met 


Glu 


Leu 


Ala 


Lys 


Thr 


Arg 


Leu 


Gln


Leu 


Gln


Asp 


Ala 


Gly










50 










55 










60 


Pro 


Ala 


Arg 


Thr 


Tyr


Lys


Gly


Ser 


Leu 


Asp 


Cys


Leu 


Ala 


Gln


Ile










65 










70 










75 


Tyr


Gly


His 


Glu 


Gly


Leu 


Arg


Gly


Val 


Asn 


Arg


Gly


Met 


Val 


Ser 










80 










85 










90 


Thr 


Leu 


Leu 


Arg


Glu 


Thr 


Pro 


Ser 


Phe 


Gly 


Val 


Tyr


Phe 


Leu 


Thr 










95 










100 










105 


Tyr


Asp 


Ala 


Leu 


Thr 


Arg


Ala 


Leu 


Gly 


Cys 


Glu 


Pro 


Gly


Asn


Arg










110 










115 










120 


Leu 


Leu 


Val 


Pro 


Lys


Leu 


Leu 


Leu 


Ala Gly 


Gly


Thr 


Ser 


Gly


Ile










125 










130 










135 


Val 


Ser 


Trp


Leu 


Ser 


Thr 


Tyr


Pro 


Val 


Asp 


Val 


Val 


Lys


Ser 


Arg 










140 










145 










150 


Leu 


Gln


Ala 


Asp 


Gly


Leu 


Arg 


Gly


Ala 


Pro 


Arg 


Tyr


Arg 


Gly


Ile










155 










160 










165 


Leu 


Asp 


Cys


Val 


His 


Gln


Ser 


Tyr


Arg 


Ala 


Glu 


Gly


Trp


Arg 


Val 










170 










175 










180 


Phe 


Thr 


Arg 


Gly 


Leu 


Ala 


Ser 


Thr 


Leu 


Leu 


Arg 


Ala 


Phe 


Pro 


Val 










185 










190 










195 


Asn 


Ala 


Ala 


Thr 


Phe 


Ala 


Thr 


Val 


Thr 


Val 


Val 


Leu 


Thr 




Ala 










200 










205 








210 


Arg 


Gly 


Glu 


Glu 


Ala 


Gly


Pro 


Glu 


Gly Glu 


Ala 


Val 


Pro 


Ala 


Ala 










215 










220










225 


Pro 


Ala 


Gly 


Pro 


Ala 


Leu 


Ala 


Gln


Pro 


Ser 


Ser 


Leu 
















230 










235 












<210> 25 


























<211> 345 


























<212> PRT 


























<213> Homo sapiens 






















<220> 




























<221> misc_feature






















<223> Incyte ID No: 3038193CD1
















<400> 25 


























Met 


Arg 


Leu 


Leu 


Glu 


Arg 


Met 


Arg 


Lys 


Asp 


Trp


Phe 


Met 


Val 


Gly


1 








5 










10 










15 


Ile


Val 


Leu 


Ala 


Ile


Ala 


Gly 


Ala 


Lys 


Leu 


Glu 


Pro 


Ser 


Ile


Gly










20 










25 










30 


Val 


Asn Gly 


Gly 


Pro 


Leu 


Lys 


Pro 


Glu 


Ile


Thr 


Val 


Ser 


Tyr


Ile










35 










40 










45 


Ala 


Val 


Ala 


Thr 


Ile


Phe 


Phe 


Asn 


Ser 


Gly 


Leu 


Ser 


Leu 


Lys 


Thr 










50 










55 










60 


Glu 


Glu 


Leu 


Thr 


Ser 


Ala 


Leu 


Val 


His 


Leu 


Lys 


Leu 


His 


Leu 


Phe 










65 










70 








75 


Ile


Gln


Ile


Phe 


Thr 


Leu 


Ala 


Phe 


Phe 


Pro 


Ala 


Thr 


Ile


Trp 


Leu 










80 










85 










90 


Phe 


Leu 


Gln


Leu 


Leu 


Ser 


Ile


Thr 


Pro 


Ile


Asn 


Glu 


Trp


Leu 


Leu 










95 










100 










105 


Lys 


Gly Leu 


Gln


Thr 


Val 


Gly 


Cys 


Met 


Pro 


Pro 


Pro 


Val 


Ser 


Ser 










110 










115 










120 


Ala 


Val 


Ile


Leu 


Thr 


Lys 


Ala 


Val 


Gly 


Gly 


Asn 


Glu 


Gly 


Ile


Val 










125 










130










135 


Ile


Thr 


Pro 


Leu 


Leu 


Leu 


Leu 


Leu 


Phe 


Leu 


Gly 


Ser 


Ser 


Ser 


Ser 










140 










145 










150 


Val 


Pro 


Phe 


Thr 


Ser 


Ile


Phe 


Ser 


Gln


Leu 


Phe 


Met 


Thr 


Val 


Val 










155 










160 










165 


Val 


Pro 


Leu 


Ile


Ile


Gly 


Gln


Ile


Val 


Arg 


Arg 


Tyr 


Ile


Lys 


Asp 










170 










175 










180 


Trp 


Leu 


Glu 


Arg 


Lys 


Lys 


Pro 


Pro 


Phe 


Gly 


Ala 


Ile


Ser 


Ser 


Ser 










185 










190 










195 


Val 


Leu 


Leu 


Met 


Ile


Ile


Tyr 


Thr 


Thr 


Phe 


Cys 


Asp 


Thr 


Phe 


Ser 










200 










205 










210 


Asn 


Pro 


Asn 


Ile


Asp 


Leu 


Asp 


Lys 


Phe 


Ser 


Leu 


Val 


Leu 


Ile


Leu 
215










220










225 


Phe Ile


Ile


Phe 


Ser 


Ile


Gln


Leu 


Ser 




Met 


XI tr U. 




J. J. IX 


Phe 








230 










235 










240 


Ile Phe


Ser 


Thr 


Arg 


Asn 


Asn 


Ser 


Gly 


Phe 


XXIX 


Pro 


Ala 


Asp 


Thr 








245 










250 










255 


Val Ala 


Ile


Ile


Phe 


Cys 


Ser 


Thr 


His 


Lys 


Ser 


Leu 


Thr 


Leu 


Gly








260 










265 










270 


Ile Pro


Met 


Leu 


Lys 


Ile


Val 


Phe 


Ala 


Gly 


xy x 


Glu 


His 


Leu 


Ser 








275 










280 








285 


Leu Ile


Ser 


Val 


Pro 


Leu 


Leu 


Ile


Tyr


His 


Pro 


Ala 


Gln


Ile


Leu 








290 










295 










300 


T .oil fi] v 


Ser 


Val 


Leu 


Val 


Pro 


Thr 


Ile


Lys 


Ser 


Tin 




V CX X 


Ser 








305 










310 








315 


Arg Gln


Lys 


Lys 


Leu 


Leu 




XJXX 






irx w 


T.oi i 

Xj t:L4. 


ill CL 


nail 


T.<=n i 
XJfcr IX 








320 










325










330


Asn Asn 


Pro 


Glu 


Gly 


Leu 


Glu 




Leu 


Ser 


Tip 
IXC 


Lys 


rue 




His 








335 








340 










345 


<210> 26 


























<211> 521 


























<212> PRT 


























<213> Homo sapiens 






















<220> 




























<221> misc_feature






















<223> Incyte ID No: 3460979CD1
















<400> 26 


























Met Ala 


Ala 


Leu 


Ala 


Pro 


Val 


Gly 


Ser 


Pro 


Ala 


Ser 


Arg 


Gly 


Pro 


1 






5 










10 








15 


Arg Leu 


Ala 


Ala 


Gly 


Leu 


Arg 


Leu 


Leu 


Pro 


Met 


Leu 


Gly 


Leu 


Leu 








20 










25 










30 


Gln Leu


Leu 


Ala 


Glu 


Pro 


Gly 


Leu 


Gly Arg 


Val 


His 


His 


Leu 


Ala 








35 










40 










45 


Leu Lys 


Asp Asp 


Val 


Arg 


His 


Lys 


Val 


His 


Leu 


Asn 


Thr 


Phe 


Gly 








50 










55 










60 


Phe Phe 


Lys 


Asp 


Gly 


Tyr 


Met 


Val 


Val 


Asn 


Val 


Ser 


Ser 


Leu 


Ser 








65 










70 










75 


Leu Asn 


Glu 


Pro 


Glu 


Asp 


Lys 


Asp 


Val 


Thr 


Ile


Gly 


Phe 


Ser 


Leu 








80 










85 










90 


Asp Arg 


Thr 


Lys 


Asn 


Asp 


Gly 


Phe 


Ser 


Ser 


Tyr 


Leu Asp 


Glu 


Asp 








95 










100 










105 


Val Asn 


Tyr 


Cys 


Ile


Leu 


Lys 


Lys 


Gln


Ser 


Val 


Ser 


Val 


Thr 


Leu 








110 








115 










120 


Leu Ile


Leu Asp 


Ile


Ser 


Arg 


Ser 


Glu 


Val 


Arg Val 


Lys 


Ser 


Pro 








125 










130 










135 


Pro Glu 


Ala Gly 


Thr 


Gln


Leu 


Pro 


Lys 


Ile


Ile


Phe 


Ser 


Arg 


Asp 








140 










145 








150 


Glu Lys 


Val 


Leu 


Gly 


Gln


Ser 


Gln


Glu 


Pro 


Asn 


Val 


Asn 


Pro 


Ala 








155 










160 










165 


Ser Ala Gly Asn 


Gln


Thr 


Gln


Lys 


Thr 


Gln


Asp 


Gly Gly Lys 


Ser 








170 










175 










180 


Lys Arg 


Ser 


Thr 


Val 


Asp 


Ser 


Lys 


Ala 


Met 


Gly Glu Lys 


Ser 


Phe 








185 










190 










195 


Ser Val 


His 


Asn 


Asn 


Gly Gly Ala 


Val 


Ser 


Phe 


Gln


Phe 


Phe 


Phe 








200 










205 










210 


Asn Ile


Ser 


Thr 


Asp 


Asp 


Gln


Glu 


Gly 


Leu 


Tyr 


Ser 


Leu 


Tyr 


Phe 








215 










220 










225 


His Lys 


Cys 


Leu 


Gly 


Lys 


Glu 


Leu 


Pro 


Ser 


Asp 


Lys 


Phe 


Thr 


Phe 








230 










235 










240 


Ser Leu 


Asp 


Ile


Glu 


Ile


Thr 


Glu 


Lys 


Asn 


Pro 


Asp 


Ser 


Tyr 


Leu 








245 










250 










255 


Ser Ala 


Gly 


Glu 


Ile


Pro 


Leu 


Pro 


Lys 


Leu 


Tyr 


Ile


Ser 


Met 


Ala 








260 










265 










270 


Phe Phe 


Phe 


Phe 


Leu 


Ser 


Gly 


Thr 


Ile


Trp 


Ile


His 


Ile


Leu 


Arg 








275 










280 










285 
Lys


Arg 


Arg 


Asn 


Asp 

290


Val 


Phe 


Lys 


Leu 


Pro 


Phe 


Thr 


Lys 

305


Ser 


Leu 


Ser 


Tyr 


His 


Tyr 


Ile


Ser 
320 


Ser 


Gln


Gly 


Val 


Val 


Tyr 


Tyr 


Ile
335


Thr 


His 


Leu 


Ile


Thr 


Ile


Ala 


Leu 
350 


Ile


Gly 


Thr 


Ile


Leu 


Ser 


Asp 


Lys 
365


Asp 


Lys 


Lys 


Leu 


Gln


Val 


Leu 


Ala 
380


Asn 


Val 


Ala 


Glu 






Thr 


X 11 J. 

■5Qt; 
~> j j 


Vj7 _L Li. 


Tyr 


Gly 


Leu 


Val 


Asp 


Leu 


Leu 

*± X \J 


Cys 


Cys 


Gly 


XX P 




Ile




n±t> 

*A ^ 3 


-Li t: LI 


fill n 

^3 XiX 


1-3 X IX 




Al ^ 

J. CI. 


Ile


A an 


T .oi 1 


Al Pi 


Xj_y o 


XJt= ix 










Add 






Val 


Leu 


Ile


Val 


Cys 
^± j j 


Tyr 


Ile


Tyr 


Leu 


Leu 


Lys


Leu 


Ala 


Val 


Pro 


Phe 








470 








Leu 


Leu 


Asp 


Glu 


Thr 


Ala 


Thr 


Leu 








485 








Tyr 


Lys 


Phe 


Arg 


Pro 
500 


Ala 


Ser 


Asp 


Gin 


Glu 


Glu 


Glu 


Asp 


Leu 


Glu 


Met 



515 

<210> 27 
<211> 555 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7472200CD1 

<400> 27 



Met 


Thr 


Leu 


Val 


Tyr 


Phe 


Pro 


Pro 


1 








5 








Gin 


Pro 


Ser 


Arg 


Ser 
20 


Ser 


Arg 


Leu 


Ser 


Trp 


Gin 


Leu 


Ala 


Leu Arg 


Phe 










35 








Gly 


Leu 


Asp 


Arg 


Leu 
50 


Leu 


Ser 


Ala 


Phe 


Val


Trp 


Leu 


Cys 
65 


Thr 


Phe 


Val 


Tyr 


Val 


Cys 


Leu 


Ile
80 


Leu 


Ser 


Ala 


Gln


Thr 


Val 


Val 


Asp 
95 


Ser 


Thr 


Arg 


Phe 


Pro 


Val 


Ile


Thr 
110 


Ile


Cys 


Asn 


Arg 


Leu 


Ala 


Glu 


Ala 
125 


Lys 


Ser 


Arg 


Ser 


Ala 


Gln


Gln


Glu 
140 


Leu 


Phe 


Glu 


Asp 


Ala 


Tyr 


Phe 


Gly 
155 


His 


Phe 


Gln


Gln


Pro 


Thr 


Glu 


Leu 
170


Leu 


Asn 


Tyr 



Ile


His 


Trp 


Leu 


Met 


Ala 


Ala 




295 










300


Leu 


Val 


Phe 


His 


Ala 


Ile


Asp 




310 










315 


Phe 


Pro 


Ile Glu Gly Trp


Ala 




325 










330 


Leu 


Lys 


Gly Ala Leu 


Leu 


Phe 




340 










345 


Gly


Trp


Ala 


Phe 


Ile


Lys 


His 




355 










360 


Ile


Phe 


Met 


Ile


Val 


Ile


Pro 




370 










375 


Tyr


Ile


Ile


Ile


Glu 


Ser 


Thr 


385 










390 


Leu 


Trp


Lys 


Asp 


Ser 


Leu 


Phe 




400 










405 


Ala 


Ile


Leu 


Phe 


Pro 


Val 


Val 




415 










420 


Ala 


Ser 


Ala Thr Asp 


x_y 


Lys




430










435 


Lys 


Leu 


Phe


Arg 


His


Tyr 


xy x 




445 










450 


Phe 


Thr 


Arg 


Tip 
lie 


Ile


Ala 


Phe 












465 


Gln


xx p 


Lys 


Trp 


Xjfcr IX 


Tyr 


Gln




475 










480 


Val 


Phe 


Phe 


Val 


Leu 


Thr 


Gly 




490 










495 


Asn 


Pro 


Tyr 


Leu 


Gln


Leu 


Ser 




505 








510 


Glu 


Ser 


Val 












520 












Ser 


Lys 


Leu 


Gln


Gln


Gln


Gln




10 










15 


Ala 


Gln


Gln


Leu 


Ala 


Gln


Ser 




25 










30


Gly Lys 


Arg 


Thr 


Thr 


Ile


His 




40 










45 


Lys 


Ala 


Ser Arg 


Trp 


Glu


Arg 




55 










60 


Ser 


Ala 


Phe 


Leu 


Gly Ala 


Val 




70 










75 


Arg 


Tyr 


Asn 


Ala 


Ala 


His 


Phe 




85 










90 


Phe 


Pro 


Val 


Tyr 


Arg 


Ile


Pro 




100 










105 


Arg Asn 


Arg 


Leu 


Asn 


Trp 


Gln




115 










120 


Phe 


Leu 


Ala 


Asn 


Gly 


Ser 


Asn 




130 










135 


Leu 


Ile


Val 


Gly 


Thr 


Tyr 


Asp 




145 










150 


Ser 


Phe 


Glu 


Arg 


Leu 


Arg 


Asn 




160 










165 


Val 


Asn 


Phe 


Ser 


Gln


Val 


Val 




175 










180 
Asp


Phe 


Met 


Thr 


Trp 


Arg 


Cys 


Asn 


Glu 


Leu 


Leu 


Ala 


Glu 


Cys 


Leu 










185 










190 








195 


Trp 


Arg 


His 


His 


Ala 


Tyr Asp 


Cys 


Cys 


Glu 


Ile


Arg 


Ser 


Lys 


Arg 










200 










205 










210 


Arg 


Ser 


Lys 


Asn 


Gly 


Leu 


Cys 


Trp 


Ala 


Phe 


Asn 


Ser 


Leu 


Glu 


Thr 


Glu 








215 










220 










225 


Glu 


Gly Arg 


Arg 


Met 


Gln


Leu 


Leu 


Asp 


Pro 


Met 


Trp 


Pro 


Trp 










230 










235 








240 


Arg 


Thr 


Gly 


Ser 


Ala 


Gly Pro 


Met 


Ser 


Ala 


Leu 


Ser 


Val 


Arg 


Val 










245 










250 








255 


Leu 


Ile


Gln


Pro 


Ala 


Lys


His 


Trp 


Pro 


Gly 


His 


Arg 


Glu 


Thr 


Asn 










260 










265 










270 


Ala 


Met 


Lys 


Gly 


Ile


Asp 


Val 


Met 


Val 


Thr 


Glu 


Pro 


Phe 


Val 


Trp 


His 








275 










280 










285 


Asn 


Asn 


Pro 


Phe 


Phe 


Val 


Ala 


Ala 


Asn 


Thr 


Glu 


Thr 


Thr 


Met 










290 










295 










300 


Glu 


Ile


Glu 


Pro 


Val 


Ile


Tyr


Phe 


Tyr 


Asp 


Asn 


Asp 


Thr 


Arg 


Gly 










305










310 










315 


Val 


Arg 


Ser 


Asp 


Gln


Arg 


Gln


Cys 


Val 


Phe 


Asp 


Asp 


Glu 


His 


Asn 










320 










325 










330 


Ser 


Lys 


Asp 


Phe 


Lys 


Ser 


Leu 


Gln Gly


Tyr 


Val 


Tyr 


Met 


Ile


Glu 










335 










340 










345 


Asn 


Cys 


Gln


Ser 


Glu 


Cys


His 


Gln


Glu 


Tyr 


Leu Val 


Arg 


Tyr 


Cys 










350 










355 








360 


Asn 


Cys 


Thr 


Met 


Asp 


Leu 


Leu 


Phe 


Pro 


Pro 


Asp 


Leu 


Leu 


Ile


Tyr 




His 






365 










370 










375 


Ser 


Asn 


Pro 


Gly 


Glu 


Lys 


Glu 


Phe 


Val 


Arg 


Asn 


Gln


Phe 


Gln










380 










385 










390 


Gly 


Met 


Ser 


Cys 


Lys 


Cys 


Phe 


Arg Asn 


Cys 


Tyr 


Ser 


Leu 


Asn 


Tyr 










395 










400 










405 


Ile


Ser 


Asp Val 


Arg 


Pro 


Ala 


Phe 


Leu 


Pro 


Pro 


Asp 


Val 


Tyr 


Ala 










410 










415 






420 


Asn 


Asn 


Ser 


Tyr 


Val 


Asp 


Leu 


Asp Val 


His 


Phe 


Arg 


Phe 


Glu 


Thr 










425 










430 








435 


Ile


Met 


Val 


Tyr 


Arg 


Thr 


Ser 


Leu 


Val 


Phe 


Gly 


Trp 


Val 


Asp 


Leu 










440 










445 






450 


Met 


Val 


Ser 


Phe 


Gly 


Gly 


Ile


Ala Gly 


Leu 


Phe 


Leu 


Gly 


Cys 


Ser 










455 










460 








465 


Leu 


Ile


Ser 


Gly 


Met 


Glu 


Leu 


Ala 


Tyr 


Phe 


Leu 


Cys 


Ile


Glu 


Val 










470 










475 








480 


Pro 


Ala 


Phe 


Gly 


Leu 


Asp 


Gly 


Leu 


Arg 


Arg 


Arg 


Trp 


Lys 


Ala 


Arg 










485 










490 








495 


Arg 


Gln


Met 


Asp 


Leu 


Gly Val 


Thr 


Val 


Pro 


Thr 


Pro 


Thr 


Leu 


Asn 










500 










505 










510 


Phe 


Gln


Gln


Thr 


Thr 


Pro 


Ser 


Gln


Leu 


Met 


Glu 


Asn 


Tyr 


Ile


Met 


Gln








515 










520 








525 


Leu 


Lys 


Ala 


Glu 


Lys 


Ala 


Gln


Gln


Gln


Lys 


Ala 


Asn 


Phe 


Gln










530 










535 








540 


Asn 


Trp 


His 


Arg 


Ile


Thr 


Phe 


Ala 


Gln


Lys 


His 


Val 


Ile


Gly 


Lys 










545 










550 








555 



<210> 28 

<211> 2080 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 1416107CB1 

<400> 28 

ggcggttcag gcgccagagc tggccgatcg gcgttggccg ccgacatgac gcccgaggac 60 

ccagaggaaa cccagccgct tctggggcct cctggcggca gcgcgccccg cggccgccgc 120 

gtcttcctcg ccgccttcgc cgctgccctg ggcccactca gcttcggctt cgcgctcggc 180 

tacagctccc cggccatccc tagcctgcag cgcgccgcgc ccccggcccc gcgcctggac 240 

gacgccgccg cctcctggtt cggggctgtc gtgaccctgg gtgccgcggc ggggggagtg 300 
ctgggcggct

cccttcgtgg 
ggccgcctcc 
tccgaaatcg 
gtcgtcggca 
gtgctgggct 
ccgcgcttcc 

tggggctccg 

gccctgctgc 
ttccagcagc 
gccaagttca 
acagctgtgg 
ggtgtggtca 
ggccctggca 
gccagcgtgg 
gcggtgggct 
aagggcgtgg 
aaggagttca 
gctttctgca 
actctggaac 
agcaagcctg 
cagaatccag 
cagcccatga 
tgaggactca 
ctaaagcagc 
ttggaggttg 
aaatttgttt 
gcaagctcag 
ttgctcatgg 
gggctcagtt 



ggctggtgga 
ccggctttgc 
tcaccggcct 
cctacccagc 
tcctcctggc 
gcgtgccccc 
tgctgactca 
agcagggctg 
ggcagcccgg 
tgtcgggggt 
aggacagcag 
cggctctcat 
tggtgttcag 
actcctcgca 
ggctggcctg 

gggggcccat 

cgacaggcat 
gcagcctcat 
tcttcagtgt 
aaatcacagc 
tgactccaag 
ccccttggag 
cccggggcta 
ggaacacctt 
ggaagaggag 
ggtgctgggc 
gccaaataaa 
tttgaaaagg 
tcagccaagc 
ccctgggtca 



ccgcgccggg 
cgtcatcacc 
ggcctgcggt 
agtccggggg 
ctacctggca 
ctccctcatg 
gcacaggcgc 
ggaagacccc 
catctacaag 
caacgccgtc 
cctggcctcg 
catggacaga 
cacgagtgcc 
cgtggccatc 
gctggccgtg 
cccctggctc 
ctgcgtcctc 
ggaggtcctc 
ccttttcact 
ccattttgag 
ctgggcccaa 
ccttggtctg 
ggaggctcac 
cgagctttgc 
gtgggcctct 
attcagtcgc 
gactgacaca 
gtttattccc 
ttacccttca 
tcagccatca 



cgcaagctga 
gcggcccagg 
gttgcctccc 
ttgctcggct 
ggctgggtgc 
ctgcttctca 
caggaggcca 
cccatcgggg 
cccttcatca 
atgttctatg 
gtcgtcgtgg 
gcagggcgga 
ttcggcgcct 
tcggcgcctg 
ggcagcatgt 
ctcatgtcag 
accaactggc 
aggccctatg 
ttgttctgtg 
gggcgatgac 
gcccagagcc 
cagggtccct 
tgcctcctgt 
agacctgcgg 
aggatctttg 
tcctctcacg 
gaaaatcagg 
atcactgccc 
cactgagaag 
aatcttgttg 



gcctcttgct 
acgtgtggat 
tagtggcccc 
cctgtgtgca 
tggagtggcg 
tgtgcttcat 
tggccgccct 
ctgagcagag 
tcggcgtctc 
cagagaccat 
gtgtcatcca 
ggctgctcct 
acttcaagct 
tctctgcaca 
gcctcttcat 
agatcttccc 
tcatggcctt 
gagccttctg 
tccctgaaac 
agccactcac 
cctgcctgcc 
ccttcctgtc 
tccagctcct 
tcagccctcc 
tcttctggct 
cggctgcctt 
tcagtgtctc 
aggacaccct 
tcatttctgg 



<210> 29 
<211> 2128 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 
<223> Incyte ID No: 1682513CB1



<400> 29 

ctggccctag 

actacctgac 

acacagtgct 

ttaccaagat 

tggaggccgt 

agattgggaa 

agtggcgcaa 

tggtcatctt 

accgcaccac 

tcctgttctt 

ctctcttcat 

tctcagcagc 

tggtcctggg 

atagcatcat 

tgctcttcat 

tgaaggtgtg 

acagcgagac 

acctggagat 

acatcatcct 

tgggccaggt 

tggacattga 

tcaccgtggg 

aggtgaactg 

atgagaccta 

ggtcctcggt 

tggtgcctct 



ggagctgccc 
ggagaacccc 
gcatgcgctg 
gtacgacctg 
gctcaacaac 
ccgccacgag 
gttcggggcc 
cactctcacc 
ggtggactac 
cttcaccaac 
tgatggctcc 
cctctacctg 
ctggatgaat 
gatccagaag 
gatcggctac 
caatggggac 
cttcagcacc 
gctgagcagc 
cacctttgtg 
ctccaaggag 
gcgctccttc 
caagagctcg 
gtctcactgg 
ccagtattat 
ggtaccccgc 
ggacagcacg 



ctgtcgctgg 
cacaagaagg 
gtggccattg 
ctgctgctca 
gacggcctct 
atgctggctg 
gtctccttct 
gcctactacc 
ctgcggctgg 
atcaaagact 
ttccagctgc 
gcagggatcg 
gccctttact 
attctcttca 
gcttcagccc 
cagaccaact 
ttcctcctgg 
accaagtacc 
ctgctcctca 
agcaagcaca 
cccgtattcc 
gacggcactc 
aaccagaact 
ggcttctcgc 
gtggtggaac 
gggaaccccc



ctgcctgcac 
cggacatgcg 
ctgacaacac 
agtgtgcccg 
cgcccctcat 
tggagcccat 
acatcaacgt 
agccgctgga 
ctggcgaggt 
tgttcatgaa 
tctacttcat 
aggcctacct 
tcacccgtgg 
aggacctttt 
tggtctccct 
gcacagtgcc 
acctgtttaa 
ccgtggtctt 
acatgctcat 
tctggaagct 
tgaggaagtc 
ctgaccgcag 
tgggcatcat 
ataccgtggg 
tgaacaagaa 
gctgcgatgg 



gtgctccgtg 
gctgctgggg 
ggtctacatc 
gctaatggtc 
ctggctggct 
gcccgagacc
gcggttcctg 
ctttcacctg 
cctgatggcc 
ctttgaagag 
ggtgctgttc 
ggtcttgtca 
gacccagggt 
gcctgttgat 
cgccggcttt 
tctgcatgtc 
tctcgtgacc 
gcttgcctcc 
taaaggaaag 
taggggatgg 
ccaggggagc
atgctccctc 
gctgctgctc 
atgcgcaaga 
ggaggtgctt 
atcgggaagg 
tgggctttgt 
gtggctttac 
ctacttcctt 



caaccagccc 
gcgccaggac 
ccgtgagaac 
cctcttcccc 
gatggctgcc 
caatgaactg 
ggtctcctac 
gggcacaccg 
cattacgctc 
gaaatgccct 
ctactctgtc 
ggccgtgatg 
gctgaagctg 
ccgattcctg 
cctgaacccg 
cacttacccc 
gctgaccatc 
catcatcctg 
tgccctcatg 
gcagtgggcc 
cttccgctct 
gtggtgcttc 
caacgaggac 
ccgcctccgc 
ctcgaacccg 
ccaccagcag 
2080



cacattgtca 
tcgcgaggca 
accaagtttg 
gacagcaacc 
aagacgggca 
ctgcgggaca 
ctgtgtgcca 
ccgtaccctt 
ttcactgggg 
ggagtgaatt 
ctggtgatcg 
gtctttgccc 
acggggacct 
ctcgtctact 
tgtgccaaca 
tcgtgccgtg 
ggcatgggcg 
ctggtgacct 
ggcgagacag 
accaccatcc 
ggggagatgg 
agggtggatg 
ccgggcaaga 
agggatcgct 
gacgaggtgg 
ggttaccccc 
WO 01/46258 



PCT/US00/35095 



gcaagtggag gactgatgac gccccgctct agggactgca gcccagcccc agcttctctg 1620
cccactcatt tctagtccag ccgcatttca gcagtgcctt ctggggtgtc cccccacacc 1680
ctgctttggc cccagaggcg agggaccagt ggaggtgcca gggaggcccc aggaccctgt 1740
ggtcccctgg ctctgcctcc ccaccctggg gtgggggctc ccggccacct gtcttgctcc 1800
tatggagtca cataagccaa cgccagagcc cctccacctc aggccccagc ccctgcctct 1860
ccattattta tttgctctgc tctcaggaag cgacgtgacc cctgccccag ctggaacctg 1920
gcagaggcct taggaccccg ttccaagtgc actgcccggc caagccccag cctcagcctg 1980
cgcctgagct gcatgcgcca ccatttttgg cagcgtggca gctttgcaag gggctggggc 2040
cctcggcgtg gggccatgcc ttctgtgtgt tctgtagtgt ctgggatttg ccggtgctca 2100
ataaatgttt attcattgaa aaaaaaaa 2128



<210> 30 

<211> 2825 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 2446438CB1



<400> 30 

cgttgtgcac 

gtcctgacag 

agggtgaccg 

ggttccgccg 

agcccctgct 

cggggctcca 

ctccatctgc 

ccctccagct 

gaggcggaca 

cagggcgagg 

acaggtgcca 

tcccggggtg 

tacctcaccg 

gtgctgaacc 

gactctggca 

cacagcgctc 

gagaatgggg 

acttgctttt 

gtggtaagct 

cagggcaaca 

gcactggtga 

gtgcagcttg 

gagggcaaga 

ctttcccgaa 

gcttctgtgg 

agcccgcacc 

tgggatctgc 

atcttcaccg 

aaagcggagg 

atctacctcc 

tcgttcatag 

tcccaggtgc 

gtgctgggct 

agtgtcatga 

gtcttccttt 

gaagctccta 

gagggcaacg 

accatcggca 

ctgctgctgc 

ctcatgagcg 

aaagccatct 

gcaggtgtga 

ttcagggtgg 

gacccgtcag 

aaggaggatg 

aactgatggc 



gtaattcggc 
gggagagtta 
agagaccaga 
ctcctctgct 
actgagaagc 
gtcaggccaa 
acagaggtcc 
ctccagtttt 
gaggaaagct 
accggaaatt 
gtcagccgga 
tccccgagga 
actcggaata 
ttaaggacgg 
atcctcagcc 
tgcacatcgc 
ccaatgtgca 
atttcggtga 
acctcctgga 
cagtcctgca 
ccagcatgta 
aggacatccg 
tcgagatttt 
agttcaccga 
acagctgtga 
gacaccgaat 
tcatccccaa 
ctgttgccta 
ttggaaactc 
tcgtgggcca 
acagctactt 
tgtgtttcct 
ggctgaacct 
tccagaaggt 
tcggcttcgc 
caggccccaa 
gggcccagta 
tgggcgagct 
tggcctacgt 
agaccgtcaa 
ctgtcctgga 
tgctgaccgt 
aggaggtgaa 
gggcaggtgt 
aggatggtgc 
ccagatgcag 



tcgacgtgtg 
agctcccgtt 
acctgcttgc 
gtcagcgccg 
tccgggatcc 
caccgacgcg 
tggctggacc 
caggttggag 
ggattttggg 
cgcccctcag 
tccaaaccga 
tctggctgga 
cacagagggc 
ggtcaatgcc 
cctggtaaat 
cattgagaag 
tgcccgggcc 
gctacccctc 
gaacccacac 
tgccctagtg 
tgatgggctc 
caacctgcag 
caggcacatc 
gtggtgctat 
ggagaactca 
ggtcgttttg 
gttcttctta 
ccatcagcct 
catgctgctg 
gctgtggtac 
tgaaatcctc 
ggccatcgag 
gctttactat 
catcctgcgg 
tgtagccctg 
tgccacagag 
caggggtatc 
ggccttccag 
gctgctcacc 
cagtgtcgcc 
gatggagaat 
tggcactaag 
ctgggcttca 
ccctcgaact 
ctctgaggaa 
caggaggcca 



tccagatggt 
ctccaccgtg 
tggagcttag 
gcagcccctc 
cagcagccgc 
cagctgggag 
gagcagcctc 
acattagatg 
agcgggctgc 
ataagagtca 
tttgaccgag 
cttccagagt 
tccacaggta 
tgcattctgc 
gcccagtgca 
aggagtctgc 
tgcggccgct 
tctttggccg 
cagcccgcca 
atgatctcgg 
ctccaagctg 
gatctcacgc 
ctgcagcggg 
gggcctgtcc 
gtgctggaga 
gagcccctga 
aacttcctgt
accctgaaga 
acgggccaca 
ttctggcggc 
ttcctgttcc 
tggtacctgc 
acacgtggct 
gacctgctgc 
gtgagcctga 
tcagtgcagc 
ctggaagcct 
gagcagctgc 
tacatcctgc 
actgacagct 
ggctattggt 
ccagatggca 
tgggagcaga 
ctcgagaacc 
aactatgtgc 
gaggacagag 



cagtctctgg 
ccggctggcc 
tgctcagagc 
ccggcttcac 
cacgccctgg 
gaagacagga 
ctcctcctag 
caggccaaga 
ctcccatgga 
acctcaacta 
atcggctctt 
acctgagcaa 
agacgtgcct 
cactgctgca 
cagatgacta 
agtgtgtgaa 
tcttccagaa 
cttgcaccaa 
gcctgcaggc 
acaactcagc 
gggcccgcct 
ctctgaagct 
agttttcagg 

gggtgtcgct 

tcattgcctt 
acaaactgct 
gtaatctgat 
agcaggccgc 
tccttatcct 
gccacgtgtt 
aggccctgct 
ccctgcttgt 
tccagcacac 
gcttccttct 
gccaggaggc 
ccatggaggg 
ccttggagct 
acttccgcgg 
tgctcaacat 
ggagcatctg 
ggtgcaggaa 
gccccgatga 
cgctgcctac 
ctgtcctggc 
ccgtccagct 
cagaggatct 



tggctagcct 
aggtgggctg 
tggggaggga 
ttcctcccgc
cctcagcctg 
cccttgacat 
gatgacctca 
agatggctct 
gtcacagttc 
ccgaaaggga 
caatgcggtc 
gaccagcaag 
gatgaaggct 
gatcgaccgg 
ttaccgaggc 
gctcctggtg 
gggccaaggg 
gcagtgggat 
cactgactcc 
tgagaacatt 
ctgccctacc 
ggccgccaag 
actgagccac 
gtatgacctg 
tcattgcaag 
gcaggcgaaa 
ctacatgttc 
ccctcacctg 
gctagggggg 
catctggatc 
cacagtggtg 
gtctgcgctg 
aggcatctac 
gatctactta 
ttggcgcccc 
acaggaggac 
cttcaaattc 
catggtgctg 
gctcatcgcc 
gaagctgcag 
gaagcagcgg 
gcgctggtgc 
gctgtgtgag 
ttcccctccc 
cctccagtcc 
ttccaaccac 



atctgctggc tctggggtcc cagtgaattc tggtggcaaa tatatatttt cactaaaaaa 2820 
aaaaa 2825 



<210> 31 
<211> 1718 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 
<223> Incyte ID No: 2817822CB1



<400> 31 
gcctcggtgt 
tgcatgtggc 
ctgccccagt 

gggggtgagt 

gcagagatgt 
gaccgcccca 
acacagcaca 
aagctgctgg 
ctgcaagtgc 
gccataacag 
gcatattatt 
aagtttgggc 
gacattaagg 
agaaacgaat 
gcagagactg 
tttcttttaa 
aatattgatc 
ctagcaactt 
aagtccaata 
tgacaacagg 
acatagcccc 
tgctttcctg 
ctctagagtg 
caaatcctca 
ttctgattta 
gaaaagaatt 
ttcctttccc 
tcttactcag 
cactccatca 



tcccacctag 
agggctgcgc 
ccccccggcc 
tccacaccac 
tctctagctt 
gcacctattt 
tccctgaagt 
aggacatgcc 
cgggctacag 
cacggaagtc 
cagaggtcct 
cctggaaggc
cccaggggta 
tccattttaa 
ttatgaattc 
tttcacaaac 
actcgtttaa 
gcagctactt 
acaagaccaa 
aacatacaag 
agcttggggt 
atcccctttg 
gaggttttca 
ggccccacct 
ccaaaccctc 
gctacctctt 
agagctcctc 
tttttttcct 
ataaaccact 



gggcgggcag 
agtggagcgg 
caggcggcca 
caccctgggt 
agccaaggcc 
cagacccatc 
gtaccgtgag 
acagatcttt 
cgagaacctg 
cagcgtgctt 
gtgttttctg 
ggtcctagac 
caaggtattc 
catttattca 
tggcgtggct 
atcaggcaat 
ggactttcca 
ccttttcaaa 
gtaagaatgt 
atactgtgaa 
ccaatccatc 
cgagatgctg 
aagtgcatca 
cagacctact 
caagtgattt 
gtatggaggt 
atggaatcaa 
ctgtcctacg 
tgcacgagaa 



ccaggggcac
ccagtgggca 
acgatgtcta 
accctgagga 
tccacggacg 
ctggactacc 
gctcagttct 
ggtgagcagg 
gagctcatgg
gtgtgcctgg 
caggataaga 
aacagcgacc 
tccaagttct 
ttcaccttca 
tatgaaatta 
ttccagggtt 
ctccattgca 
gcctcatgta 
ttcaacaatg 
tctagatgtt 
tgtccctggc 
tgggtgctaa 
tcagcattac 
gaatcagaat 
tgatgtattc 
acaaaagact 
gctgaagtca 
ctgcttccct 
aaaaaaaa 



ttccgctggc 
ggatgacgag 
ctgttgtgga 
agtttccggg 
cggagggccg
tgcgcactgg 
acgaaatcaa 
tgtctcggaa 
tgcgcctggc 
tggaaactga 
agatgttcaa 
tcatgcactg
acctgacgta
cctggtggtg 
aaagttgcca 
ggtctagagt 
actgatgcca
tctcccagac 
cgttggcaag 
ctgacctaaa 
atgtgccttc 
cacctcagag 
ctgtgaactt 
ctctgggggt 
taattttgag 
gacctcttac 
gtcttcttct 
cactcccctt 



ccaagtgatc 
ccagacccct 
gctgaacgtc
ctcaaagctg 
cttcttcatc 
gcaagtgccc 
gcctttggtc 
gcagtttttg 
acgtgcagaa 
ggagcaggat 
gtctgttgtc 
cctggagatg 
ccccaccaaa 
atcctcagga 
tcaaagccat
cttgccacta
ctatatttgc
ccttctcttg 
agatgtgaga 
gatgtagtct 
atgtagtagg 
ctgtcctctt 
gctggaaata 
tggcacagca 
accatctcta 
atcaaggaac 
gagagcacat 
ctcctaagag 



<210> 32 
<211> 2000 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 
<223> Incyte ID No: 4009329CB1



<400> 32 

gacgaatttg 

ctccagttgg 

ggccaccgga 

agggccaggc

gtgtgctttg 

gagctcacat 

gccgcaaggt

actgccacag 

ccagcctcct 

ttctgggagt 

tgaagctctc 

acatcttcag 

cactgtttgg 

accccttcat 



aaaccagggg 
aggggttcgg

gccgccagct 
cccgtggggg 
tgtgctgcta 
tagcccccag 
gtgtggcctg 
tgatgggggg 
ccctctggct 
caccgcagcc 
ccacaacgtg
tgccctggtg 
cgctggcgtg 
ggctgcctcc 



gtgtcctgtt 
gagaaccata 
gtttggaact 
cagatggccg 
atggcggaga 
tttccagctt 
aatgtctctg 
tacctggact 
gtcactctct 
aagtttttct 
gcaggcgtca
gccttctctg 
ctggttacca 
aggcccttct 



tgaacttggt 
gaagaggaag 
gagctactgc 
gcagaaggct 
cagtgtctgg 
caggtgtgaa 
accgctgtga 
acctggaagg 
acgtttcctg 
gccccaactt 
ccttcctggc 
acccgcacac 
cagtggtggc 
tcagggacat 



gccagataga
ggccgtgtct 
agaaagggaa 
gaatctgcgc
gactaggggc 
ccagaccccc 
cttcatccgg 
catcttctgc 
gctgctctac 
gtcggccatt
atttgggaat 
agccggcctg 
cggaggcatt
cgttttctac 



gtaactcgga 
tccgtggaca 
gtggagagta 
tgggcactga 
tcgtctacag
gtggtagact 
accaaccctg 
cacttccctc 
ctgtttctga 
tctaccacac 
ggtgcacctg 
gcccttgggg 
accatcctac 
atggtggctg 
tgttcctgac
acctgggctt 
ggcaacggag 
ccgaggagga 
cgctgttctt 
attacatgaa 
ctgtggagtt 
actggaaacg 
ccctgcagtc 
tggtggtgat 
agccccccag 
tcaacgcggc 
tgagcaacac 
tctcggattt 
gcggcatcat 
gaagccacac 
ccctggggct 
gcagagtcta 
tcattgaatt 
cctcactgca 



cttcctcatg 
gtatgtgttc 
aggatctctg 

ccgggtatct

ctaccaggag 
gtggagaagg 
cctgctgctc 
gcccctcaac 
ggggacctat 
cgcaggcaca 
gcttcactgg 
cgccacagag 
tgtgctgggg 
cacactggct 
cttcaacatc 
agaagtgaag 
cagcctcgtc 
tggcttctgc 

tggagtgatt 

ggcaggagcc 



ctcttccgtg 
tatgtggtca 
ttctgcccca 
tctaatacca 
accacggctc 
aaatcagcat 
ctcacagtcc 
tgtctgcatc 
ggtgtctatg 
gccttggctt 
ctctttgctt 
gtggtgaaca 
ctcacgctgc 
cgccagggct 
ctcgtgggtg 
ctggagccag 
ttctccctgg 
ctgctcctct 
cacctgaaaa 



gcagggtcac 
ctgtgattct 
tgccagttac 
acagctatga 
agatcctggt 
actggaaagc
ccgtcgtgga 
tggttatcag 
agataggcgg 
cagtgacctt 
tcctgggctt 
tcttgcggtc 
tggcctgggg 
acccacggat 
tggggctggg 
acggactgct 
tctcagtccc 
tctacctgaa 
gcatgtgact 



cctggcatgg 
ctgcacctgg 
tccagagatc 
ctacggtgat 
ccgggccctc 
cctcaaggtg 
cccggacaag 
ccccctggtt 
cctcgttccc 
ttttgccaca 
tctgaccagc 
cctgggtgtg 
gaacagcatt 
ggcgttctcc 
ctgcctgctc 
ggtgtgggtc 
attgcagtgc 
cttccttgtc 
gaagccgctt 



gctctgggtt 
atctaccaac 
ctctcagact 
gagtaccggc 
aatcccctgg 
ttcaagctgc 
gatgaccaga 
gtggtcctga 
gtctgggtcg 
tctgacagcc 
gccctgtgga 
gtcttccggc 
ggagatgcct 
gcctgctttg 
cagatctccc 
ctggcaggcg 
ttccagctca 
gtggccctcc 
agtgctgtgg 



<210> 33 

<211> 2216 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 6618083CB1 
<400> 33

gaaaactctt 

aacaataaca 

cgaggagaag 

ctcgtggcct 

gtggtgaatg 

catggacgtc 

ttcgccatcg 

aggaagcaca 

tgctcgctcc 

gatggaggcg 

atccgtggct 

cagcttctgg 

gtgattgtgg 

cgctacctgc 

ttgggtaaag 

aggagcatcc 

gtggtcaccg 

ttctatacca 

accttgagta 

cacctgggac 

accctcacca 

gtgggcattc 

acggtaaatg 
caaacaagcc 
cactaagctt 
atcctgattc 
catttccagg 
aggccacctg 
tccatcttca 
agccacaaga 
ttaaagcctc 
ttgcatcttc 
tggcaggaag 
aaagaagagg 
aggaacaggt 
gaaaacaaga 



cctgaaggag 
acaaaagttc 
atgaagaaag 
ccctcgcggg 
cccccacccc 
caatagaccc 
gtggacttgt 
ctttgctggc 
aggcaggagc 
tcgccctcag 
ctctggggca 
gcctgcccga 
tccctgccgt
tcttggagaa 
cagacgtttc 
gcctggtgtc 
tgattgtcac 
acagcatctt 
cagggggcat 
ggagacccct 
tcacgctgac 
tggccatcat 
tcagcattgt 
agattggact 
agccttctct 
tgttccaagt 
gctttaaatg 
cacctcccaa 
accccctgtg 
tggaccaggg 
agggaactta 
ctaagtggac 
atgctcagag 
gatggatctc 
cgatgtaaga 
gaggcagttt 



atgcagagga 
aaaacctgaa 
tgattcagcg 
cgccttcggc 
gtacatcaag 
agacactctg 
ggggacgtta 
caataatggg 
ctttgaaatg 
tgtgctcccc 
ggtgactgcc 
gctgctggga 
tgtccagctg 
gcacaacgag 
ccaagaggta 
cgtgctggag 
catggcctgc 
tggaaaagct 
cgagactttg 
cctcattggt 
cctgcaggac 
cgcctctttc 
atctgagtga 
catctgcata 
gttttttttt 
gtttgcaact 
ctgggctccc 
tcccagatca 
ttgacccagc 
tttagaagct 
cctgtctaag 
agggaagagc 
ctgaatggca 
ccaggagagg 
agacttgaca 
cctgctgcat 



agattcgaac 
aagtgaacca 
aaaaagaaat 
tcctccttcc 
gccttttaca 
actttgctct 
attgtgaaga 
tttgcaattt 
ctcatcgtgg 
atgtacctca 
atctttatct 
aaggagagta 
ctgagccttc 
gcaagagctg 
gaggaggtcc 
ctgctgagag 
taccagctct 
gggatccctc 
gctgccgtct 
ggctttgggc 
cacgccccct 
tgcagtgggc 
aaagttgacc 
tctgcctgaa 
tcctaagccc 
gtggctttct 
catcagtgtc 
cctgtcagcc 
acctgggcct 
tcatttaaac 
aaaagctgcc 
aagtccccag 
gagagactca 
gccaggaggc 
aggagttgaa 
attttatttg 



tggaggaaaa 
tgaagctcag 
tggactggtc 
tctacggcta 
atgagtcatg 
ggtctgtgac 
tgattggaaa 
ctgctgcatt 
gacgcttcat 
gtgagatctc 
gcattggcgt 
cctggccata 
cctttctccc 
tgaaagcctt 
tggctgagag 
ctccctacgt 
gtggcctcaa 
tggcaaagat 
tctctggttt 
tcatgggcct 
gggtccccta 
cagctgtttt 
ttcttcccca 
gttctttgct 
tcccaagact 
tttgactgta 
tatgggactc 
cctgccctcc 
tgctggctag 
tcacattgac 
acttagacca 
gggagccacc 

tgggcctgct 
cgcctgaggc 
attaggtgaa 
tgtgcataac 



ccctaaaata 
taaaaaggac 
ctgctcgctc 
caacctgtcg 
ggaaagaagg 
tgtgtccata 
ggttcttggg 
gctgatggcc 
catgggcata 
acccaaggag 
gttcactggg 
cctgtttgga 
ggacagccca 
ccaaacgttc 
ccacgtgcag 
ccgctggcag 
tgcaatttgg 
cccatacgtc 
ggtcattgag 
cttctttggg 
cctgagtatc 
cccagaagaa 
cccatgcaca 
aaccaaaaat 
ttttgcaatg 
gaacatgctg 
cctggaggga
gcttcctcaa
caatgacttt 
agtgtacagt 
tgagaccatc 
cgggaaagtg 
ctccatgatt 
agcttctgtg
agcaaagaaa 
cccaaggcag 



60 
tggcagggaa gtctaataaa tgaggcaaaa taaaagagct tcacctttta aaaaaa 2216

<210> 34 

<211> 1995 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7472002CB1 

<400> 34 

atgaccgaaa aaaccaatgg tgtgaagagc tccccagcca ataatcacaa ccatcatgca 60 
cctcctgcca tcaaggccaa tggcaaagat gaccacagga caagcagcag gccacactct 120 
gcagctgacg atgacacctc ctcagaactg cagaggctgg cagacgtgga tgccccacag 180 
cagggaagga gtggcttccg caggatagtt cgcctggtgg ggatcatcag agaatgggcc 240 
aacaagaatt tccgagagga ggaacctagg cctgactcat tcctcgagcg ttttcgtggg 300
cctgaactcc agactgtgac cacacaggag ggggatggca aaggcgacaa ggatggcgag 360
gacaaaggca ccaagaagaa atttgaacta tttgtcttgg acccagctgg ggattggtac 420
tactgctggc tatttgtcat tgccatgccc gtcctttaca actggtgcct gctggtggcc 480 
agagcctgct tcagtgacct acagaaaggc tactacctgg tgtggctggt gctggattat 540 
gtctcagatg tggtctacat tgcggacctc ttcatccgat tgcgcacagg tttcctggag 600 
caggggctgc tggtcaaaga taccaagaaa ctgcgagaca actacatcca caccctgcag 660 
ttcaagctgg atgtggcttc catcatcccc actgacctga tctattttgc tgtggacatc 720 
cacagccctg aggtgcgctt caaccgcctg ctgcactttg cccgcatgtt tgagttcttt 780 
gaccggacag agacacgcac caactaccct aacatcttcc gcatcagcaa ccttgtcctc 840 
tacatcttgg tcatcatcca ctggaatgcc tgcatctatt atgccatctc caaatccata 900 
ggctttgggg tcgacacctg ggtttaccca aacatcactg accctgagta tggctacctg 960
gctagggaat acatctattg cctttactgg tccacactga ctctcactac cattggggag 1020 
acaccacccc ctgtaaagga tgaggagtac ctatttgtca tctttgactt cctgattggc 1080 
gtcctcatct ttgccaccat cgtgggaaat gtgggctcca tgatctccaa catgaatgcc 1140 
acccgggcag agttccaggc taagatcgat gccgtgaaac actacatgca gttccgaaag 1200
gtcagcaagg ggatggaagc caaggtcatt aggtggtttg actacttgtg gaccaataag 1260
aagacagtgg atgagcgaga aattctcaag aatctgccag ccaagctcag ggctgagata 1320 
gccatcaatg tccacttgtc cacactcaag aaagtgcgca tcttccatga ttgtgaggct 1380 
ggcctgctgg tagagctggt actgaaactc cgtcctcagg tcttcagtcc tggggattac 1440 
atttgccgca aaggggacat cggcaaggag atgtacatca ttaaggaggg caaactggca 1500 
gtggtggctg atgatggtgt gactcagtat gctctgctgt cggctggaag ctgctttggc 1560 
gagatcagta tccttaacat taagggcagt aaaatgggca atcgacgcac agctaatatc 1620 
cgcagcctgg gctactcaga tctcttctgc ttgtccaagg atgatcttat ggaagctgtg 1680 
actgagtacc ctgatgccaa gaaagtccta gaagagaggg gtcgggagat cctcatgaag 1740 
gagggactgc tggatgagaa cgaagtggca accagcatgg aggtcgacgt gcaggagaag 1800 
ctagggcagc tggagaccaa catggaaacc ttgtacactc gctttggccg cctgctggct 1860 
gagtacacgg gggcccagca gaagctcaag cagcgcatca cagttctgga aaccaagatg 1920
aaacagaaca atgaagatga ctacctgtct gatgggatga acagccctga gctggctgct 1980
gctgacgagc cataa 1995

<210> 35 

<211> 988 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 1812692CB1 

<400> 35 

cttgggtgaa agaaaatcct gcttgacaaa aaccgtcact taggaaaaga tgtcctttcg 60 
ggcagccagg ctcagcatga ggaacagaag gaatgacact ctggacagca cccggaccct 120
gtactccagc gcgtctcgga gcacagactt gtcttacagt gaaagcgact tggtgaattt 180
tattcaagca aattttaaga aacgagaatg tgtcttcttt accaaagatt ccaaggccac 240
ggagaatgtg tgcaagtgtg gctatgccca gagccagcac atggaaggca cccagatcaa 300
ccaaagtgag aaatggaact acaagaaaca caccaaggaa tttcctaccg acgcctttgg 360
ggatattcag tttgagacac tggggaagaa agggaagtat atacgtctgt cctgcgacac 420 
ggacgcggaa atcctttacg agctgctgac ccagcactgg cacctgaaaa cacccaacct 480 
ggtcatttct gtgaccgggg gcgccaagaa cttcgccctg aagccgcgca tgcgcaagat 540 
cttcagccgg ctcatctaca tcgcgcagtc caaaggtgct tggattctca cgggaggcac 600 
ccattatggc ctgatgaagt acatcgggga ggtggtgaga gataacacca tcagcaggag 660
ttcagaggag aatattgtgg ccattggcat agcagcttgg ggcatggtct ccaaccggga 720
caccctcatc aggaattgcg atgctgaggt accggtggga caggaggagg tctgctaggt 780
cacatggaag aaagaccatg gcatgggcct gtggcctgaa ccctggggct ctgtgatgga 840
gccagccaga tcatggggaa gtctgccttt caaggagtgc ctttgggacc ttaaaggaat 900
tgaaaacaag gatgacgtac ctaattaact gctgggaaag agttaacaat gaatgttttg 960
ttcattaaaa tgtgttctca gcaatctc 988



<210> 36 

<211> 3179 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 3232992CB1 



<400> 36 

gcggagcggc 

ggggcgcggg 

tcgggcagcg 

gcagctactg 

gcggggggca 

acaccgtggc 

ccggcctgct 

acctgggcga 

ccgtcacctt 

gggggctggt 

acctcttcac 

tgggcagtgg 

actgggcatt 

tggtcccagc 

cctcatggct 

tggccacgtc 

tgcaccgcgc 

ccaaggacag 

cgggggcagg 

gtgccgtggg 

gcagcatcgt 

gggccatcac 

tggccttgca 

gctttatctc 

gcctgggcta 

tcgccactgc 

tggcgatgcc 

cacactccca 

aagctttctg 

gccctggcat 

ccacacggtt 

aagaagacag 

cttgggcaag 

acttgctggg 

agagcaggtg 

ggtatgcagg 

gcaaatgaag 

tcgggttggc 

caggggtgaa 

ggcaggtgca 

aggaccccac 

ccattcccag 

cacaaagctt 

acactggctg 

cccatccatg 

ggggagctta

tggtgcgggg 

agcttggaca 

taagcaggtg 



ggcgccggcg 
gcgcgggcgg 
taaggcgggc 
caaaggggcc 
agccgccgcc 
aggcgtcctt 
gcagtcagtg 
ccgcttcaac 
ctccagctcc 
gggcatcggg 
caagaacacg 
cctgggctac
gcgggtgtcc 
cactaaaagg 
ccgagatatg 
ggctgtctcc 
ccaagttgtg 
cctcatcttt 
agccacgcgc 
catgctgggc 
aggagcctat 
tgcagacatc 
gagcttcacc 
agacctgatc 
cgcgctcatg 
gctcttcttc 
gcccgcatct 
cctcgtctgg 
tgtgatccac 
caagaggagg 
ggacaggttc 
ccccaagtgg 
tccctgccct 
caaagcacga 
gcccaggcct 
gaccactgct 
ctgggcgccc 
tcccagcctg 
ggccctggct 
caccagccca 
tccttccccg 
aatccatggg 
cctgccccag 
ctggcattcc 
cacacaccag 
gccccctgcg 
gtggggtggg 
atgctcttct 
gaatactcac 



ccggggggcg 
cgctggagtc 
cccgaccgga 
ccggcgctca 
atcctcagct 
ctggacatcc
ttcatctgta 
aggaaggtga 
ttcattcccc 
gaggccagct 
cgtacgctca 
attactggct 
cctgtcctgg 
ggtcatgccg 
aaggccctga 
ttcgccacgg 
cagaagacag 
ggggccatca 
tggtgccgcc 
tctgccatct 
atctgtatct 
ctcatgtacg 
tcccacctgc 
cgccagagca 
ctctgccctt 
gtcagcgacc 
gtgaaagtct 
gaggtgtcct 
ggctaggcac 
ctgtgtcctc 
ccagccctag 
gtgtccgggg 
ccctggaacg 
tctgcagctt 
cagggcggca 
cagctgggcc 
aagtctctgg 
gaggtcccag 
gcagctgtac 
actctgcagg 
aggctgagct 
gcagtagcca 
agctgaggct 
accaagtgac 
gatgcagctg 
tcacccactg 
gggtgaggcc 
tgccccttag 
ccaccaagct 



cagcgagggg 
tcggccgcgg 
ccccccggca 
gcagcccaaa 
tgggcaacgt 
agcagcactt 
gcttcatggt 
ttctcagctg 
agcagtactt 
actccaccat 
tgctgtccgt 
ccagcgtgaa 
gcatgatcac 
accagctcgg 
ttcgaaaccg 
gggccctggg 
cagagacgtg 
cctgctttac 
tgaagaccca 
tcatctgcct 
tcgtcgggga 
tggtcatccc 
tgggggacgc 
ctaaggactc 
tcgtcgtggt 
gcgccagggc 
gaggtggtgc 
acagcgtccg 
ccaccctctc 
agttaccctg 
gtttgggccg 
agagcctggc 
aagggccagg 
tgaagactca 
gtcccggctt 
tcggaccttg 
gtactccctg 
atggggactg 
accacctgtg 
gcttctctcc 
gagccttttc 
gggctccggc 
gaggccccgg 
cccaggggcc 
ccaacttcac 
cctgcacttc 
ttgtggccaa 
ttactggctg 
ctggggtacc 



ctggcggtag 
gcgatgaggt 
cccccggcac 
ccggccagct 
gctcaactac 
tggggtcaag 
ggctgccccc 
cggcattttc 
ctggctgctg 
cgcccccact 
cttctacttc 
gcaggcagcc 
aggaacactc 
ggaccagctc 
cagctacgtc 
catgtggatc 
caacagcccg 
gggatttctg 
gcgggccgac 
gatcttcgtg 
gacgctgctg 
cacgcggcgc 
cgggagcccc 
cccgctctgg 
cctgggcggc 
tgagcagcag 
cattgggaca 
ggaccggctg 
tggcccaggc 
gaaggatgtg 
cagggcccct 
ctgccaccag
gggctggact 
acagaccctg 
tgaggctcac 
gggatattgg 
gaggacactg 
ttctgacaag 
cccccaggct 
ctgccaccac 
caggggcagg 
tgctggagga 
gagaggcggc 
aggccttcga 
accagcccca 
tgctgcaatc 
tgggggaccc 
gctgtggctt 
ccgagggcct 



cggttgctgc 
gcagacgctg 
ccccggctgc 
tgggccgcgg 
ctggacaggt 
gaccgaggcg 
atcttcggct 
ttctggtcgg 
gtcctgtccc 
atcattggcg 
gccatcccac 
ggagactggc
atcctcattc 
aaggcccgga 
ttctcctccc 
ccgctctacc 
ccctgtgggg 
ggcgtggtca 
ccactggtgt 
gctgccaaga 
ttttctaact 
gccactgccg 
tacctcattg 
gagttcctga 
atgttcttcc 
gtgaaccagc 
atgaagaacc 
ggctgcccca 
ctgctgagtg 
tgtgttggag 
ggggccaagg 
cttatgtgat 
ttcccacaca 
gaccatacgg 
gcgagggcct 
acgcaacctg 
tctcactgtc 
ctggcatcac 
caaggtctct 
cccccaagcc 
gcccaggaga 
agcagctatc 
ccctacccaa 
tcacccacct 
acccgctttg 
aaggtggttc 
cccaagagcc 
cagtggtgtg 
gacaagagga 
tggggtgggg gtggcatcct ccaaagacca gcctccaccc ccactccagc ctcagcgggg 3000
ccccagcgat gttttcttgt tgtacaagaa ccaggtccga gtgttgcctc ctcttccttc 3060
cggaagccaa actgctcctt tattttttag agctgctgat tgtgaatctc agagtcttaa 3120
gagagaagcc aaatatattc ctcttgtaaa tgaagaaata aacctattta aatcacaaa 3179

<210> 37 

<211> 1986 

<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3358383CB1 

<400> 37 

ggagtatctg agcaaattat ttcttacgtg actttagaga aaacggctac ctatctgacc 60 
ccaaaacgac ttgaggaaac tgtttccacg gtcctgctgc aggggggaag cacagtcgtc 120 
aagaagagag tggggtcagg atcaaaacac atttagtgtg acttagggaa agaaaacatt 180 
ttccctcttt gaacctctct ggatacagtc attttgcctc tacttgagga tcaactgttc 240 
aacctcaatg gcctttcagg acctcctggg tcacgctggt gacctgtgga gattccagat 300
ccttcagact gtttttctct caatctttgc tgttgctaca taccttcatt ttatgctgga 360 
gaacttcact gcattcatac ctggccatcg ctgctgggtc cacatcctgg acaatgacac 420 
tgtctctgac aatgacactg gggccctcag ccaagatgca ctcttgagaa tctccatccc 480 
actggactca aacatgaggc cagagaagtg tcgtcgcttt gttcatcctc agtggcagct 540 
ccttcacctg aatgggacct tccccaacac aagtgacgca gacatggagc cctgtgtgga 600 
tggctgggtg tatgacagaa tctccttctc atccaccatc gtgactgagt gggatctggt 660 
atgtgactct caatcactga cttcagtggc taaatttgta ttcatggctg gaatgatggt 720
gggaggcatc ctaggcggtc atttatcaga caggtttggg agaaggttcg tgctcagatg 780 
gtgttacctc caggttgcca ttgttggcac ctgtgcagcc ttggctccca ccttcctcat 840 
ttactgctca ctacgcttct tgtctgggat tgctgcaatg agcctcataa caaatactat 900
tatgttaata gccgagtggg caacacacag attccaggcc atgggaatta cattgggaat 960
gfcgcccttct ggtattgcat ttatgaccct ggcaggcctg gcttttgcca ttcgagactg 1020 
gcatatcctc cagctggtgg tgtctgtacc atactttgtg atctttctga cctcaagttg 1080 
gctgctagag tctgctcggt ggctcattat caacaataaa ccagaggaag gcttaaagga 1140 
acttagaaaa gctgcacaca ggagtggaat gaagaatgcc agagacaccc taaccctgga 1200
gattttgaaa tccaccatga aaaaagaact ggaggcagca caaaaaaaaa aaccttctct 1260
gtgtgaaatg ctccacatgc ccaacatatg taaaaggatc tccctcctgt cctttacgag 1320
atttgcaaac tttatggcct attttggcct taatctccat gtccagcatc tggggaacaa 1380
tgttttcctg ttgcagactc tctttggtgc agtcatcctc ctggccaact gtgttgcacc 1440 
ttgggcactg aaatacatga cccgtcgagc aagccagatg cgtctcatgt acctactggc 1500 
aatctgcttt atggccatca tatttgtgcc acaagaaatg cagacgctgc gtgaggtttt 1560 
ggcaacactg ggcttaggag cgtcggctct gaccaatacc cttgcttttg cccatggaaa 1620 
tgaagtaatt cccaccataa tcagggcaag agctatgggg atcaatgcaa cctttgctaa 1680 
tatagcagga gccctggctc ccctcatgat gatcctaagt gtgtattctc cacccctgcc 1740 
ctggatcatc tatggagtct tccccttcat ctctggcttt gctttcctcc tccttcctga 1800 
aaccaggaac aagcctctgt ttgacaccat ccaggatgag aaaaatgaga gaaaagaccc 1860 
cagagaacca aagcaagagg atccgagagt ggaagtgacg cagttttaag gaattccagg 1920
agctgactgc cgatcaatga gccagatgaa gggaacaatc aggactattc ctagacacta 1980
gcaaat 1986 

<210> 38 

<211> 3294 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 4250091CB1 

<400> 38 

tgtaagacag gaaagggatc tatttgatgt ctatcttcag atatattggc agttttcctt 60 
aagctattta gttcctcatc tgttgctttt tcattttgta tactgcaagt tcccaggcaa 120
ctcgaatttg caaacacagc catggataca ctatttacct tacagtagtt tcctgggaat 180 
ctaagtctgg tttttgttat tcttccctcc cctccactgc ataatcatgt ataactagca 240 
acatttatgg ttataggttg atttcctaag tgtggctgat ggtagcctct agtttgaagt 300
gagggaagaa tgagtagtca ggaactggtc actttgaatg tgggagggaa gatattcacg 360
acaaggtttt ctacgataaa gcagtttcct gcttctcgtt tggcacgcat gttagatggc 420
agagaccaag aattcaagat ggttggtggc cagatttttg tagacagaga tggtgatttg 480

tttagtttca tcttagattt tttgagaact caccagcttt tattacccac tgaattttca 540 

gactatctta ggcttcagag agaggctctt ttctatgaac ttcgttctct agttgatctc 600

ttaaacccat acctgctaca gccaagacct gctcttgtgg aggtacattt cctaagccgg 660 

aacactcaag cttttttcag ggtgtttggc tcttgcagca aaacaattga gatgctaaca 720
gggaggatta cagtgtttac agaacaacct tcggcgccga cctggaatgg taactttttc 780 

cctcctcaga tgaccttact tccactgcct ccacaaagac cttcttacca tgacctggtt 840 

ttccagtgtg gttctgacag cactactgat aaccaaactg gagtcaggta ttttgtactt 900 

tgcagtattt ctcttgtata ccagtttgtg atgttttctc taaaaacttg aagttcctca 960 

ggcctgtaac ttctggaaaa gatgattatt caaaataatg ttttggggta accagtggag 1020

ttgggtagaa tgaccaaata attattttcc aaactgggat actttttaga gtgaaagggg 1080

ctattattag gtgggacaaa aggaataaat gaagactgcc cagaaaaaac tgagactatg 1140
gacattcaaa tcatgggaga aaataatttt gtagattatg ttccattgct aatgaatttg 1200

acttagaaaa gaattgcctt atttttaaga gattgtttca gtggttcaca taaaggctcg 1260 

ctcactggtt tctcttgagt tccttacaca ctatataagt tgttctttca gttttatgat 1320 

tcaactactg tttttccttc agctgacttt atttttaaac acccttaaag acagatatat 1380

ctcatggcaa atttggtatc ctgttacagc cttggctctt aaacaactca aaatattggg 1440 

ataggctgtc agtatgttaa ggatagttgc tcctgagtca attcttcact tactccctct 1500 

gttgttcttg gctggatcct aacgctgatt tccactctgc tgtcacaaac atttttcccc 1560 

ccgtaaaatg tcttaatgct gtcctaccat tattttacca actgtgaaag ctggctttaa 1620

tttttaggag gaaaagaaaa gcctgcatgt gttctttatt ggtatcattt aaaatatact 1680 

tttttttttt ttttggtaaa ggtaggcgta ttttaagata ttttcttaac ttgagcagta 1740

gccaacagga aggataccag tgtctctctc tcttagcgac acactccttg gtcttgctta 1800 

ccaactggag gacactaggt agaataaccg agtatgacaa ttcttaattg tttacatttt 1860 

ataacttcct gtccttcaaa agagtttgaa atgtcatttt gggaaaagag agccagtcaa 1920 

gctagtaggc tgattgtgaa gaaaatctaa taccttatct ttatctcaaa cctctgtaca 1980 

actttatttt cattgatggg atactttaac aaaaatgaaa ttttttttgg tttttaaaat 2040

atgagtgatt atgacctctt tggggatcat gcttcaaaaa gtcagaaacc tagagacaaa 2100 

actgtcattg atttttaaga agaaacacac taggtcaaaa gaagatgtcc tggaaatacg 2160

aagtactctt taaaaaccat gcatttggag aaagtaattg tttccttgaa aaacatgatt 2220

aaaaactaaa actgggatgt tcctgtgtgt acacagtgcc aaatggtttt ccctttttat 2280 

gttgtgtttt agaaacagca cgaaagtttt ttccatttta aagtgagaaa acattatatt 2340 

tagacttcca taattccaaa atcagaagct atttttaaaa ttagcatttt cttgcatcac 2400 

caaatggtat tcaattgttt gaagctcaaa atttttacca ttccataaat gtttgtgaat 2460 

ttttagacag tgccaattta aaagtagaga tagccaatct gaatacggtg aaattatggg 2520

gatctctggt gattgggatg aaaactctgg ccttaaaagg tccactttta gtatataatt 2580 

gcctaattag caatcatttt tattttttgc tcactccctg gtctgaatct atctgtctat 2640 

tcagatattt tttggtaggt ttggaaaatg gagaagtgag cctaattggt gcctaattgt 2700 

ctggtgtatc attcacttta ttcagtttgt tctatcaata tgatttaccc ctcaaggtta 2760

acctagcagg ttgctcagtt attatctctc aaggtcacag tactagaaat acttggcttg 2820

catctttcag atgccattca tgttatcaag ctcaaattat agttggtcac aggattctaa 2880 

agtctttatt tgacttctcc tttttgaact ggctcaaatg gaaaagtgta gttgctttta 2940 

aatgttaaaa ataagtttaa actttatatt tcccattggt ttcccctatt ttgtcctttc 3000

tttgtgtgct tgaaatattt tatttttcag tttgtcctca tagggaatca agtattttag 3060 

ctaggtgatg tcttgcaagt acgttccact ttgttacaat ctactatctg tatatactat 3120

ttgtatctta attcttttat gagatgttct gtaacatttt tctcactttg acaaatgttt 3180 

ttagactgta cagtcaagat ctggcgcttg ggggtaagtg gaatgatttg ctaatattga 3240 

gaatctgttg tatcaaacat aataaacttt ttttgagatg tgaaaaaaaa aaaa 3294

<210> 39 

<211> 2043 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 70064803CB1 

<400> 39 

gcaacatggc ggctgccgtg gtgcagcgcc cgggctgagc gacagcaagt gcagcgggct 60 

cctaccccgg gtgaggggtg gcctccgcgt gggatcgtgc cctcttcagc ccgctcctgt 120

ccccgacatc acgtgtattc cgcacgtccc ctccgcgctg tgtgtctact gagacgggga 180 

ggcgtgacag ggcccgggtc ccttctcagt ggtgctctgt gcttcagggc aagctccccg 240 

tctccgggcg cacttccctc gcctgtgttc ggtccatcct cctttctcca gcctcctccc 300 

ctcgcaggtg ggatcgtcgg tgggaccgga gcgcgggcgg gcgcggcccc ccgggaccat 360

ggccgggtcc gacaccgcgc ccttcctcag ccaggcggat gacccggacg acgggccagt 420

gcctggcacc ccggggttgc cagggtccac ggggaacccg aagtccgagg agcccgaggt 480
cccggaccag gaggggctgc
agtggcggtg ctgtgctaca 
cgtccttccc gacatcgagc 
gaccgtgttc atctccagtt 
gtacaatcgg aagtatctca 
gtcatccttc atccccggag 
ggtcggggag gccagttatt 
cgaccagcgg agccggatgc 
gggctacatt gcaggctcca 
ggfcgacaccg ggtctaggag 
gccaagggga gccgtggagc 
ggcagatctg agggctctgg 
agtcctgggt gtgggcctgg 
ggctgatccc ctggtctgtg 
ccttgcctgc gcccgtggta 
cctcctgtcc atgaactggg 
ccgacgctcc accgccgagg 
gagcccctac ctcattggcc 
cttgtccgag ttccgggctc 
gggcggcgca gccttcctgg 
gctgcacgtg cagggcctgc 
ccagcggggc cgctccaccc 
tcacctacct gcacatctgc 
taaccccttg gcctggccca 
actacatggg tagctcaggg 
ggcagcccca agggctcggt 
aaa 

<210> 40 
<211> 1915 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 70356768CB1 

<400> 40 

caccactggg cgctgcgcgc tgcccttccc tccgcgcaca ggctgccggc tcaccgcttg 60 
ctaatggcag ccggggtctc cctgggacag caagacctcc gctcaggccc ctctttcgaa 120 
tgctccacgc cctcctgcga tctagaatga ttcagggcag gatcctgctc ctgaccatct 180 
gcgctgccgg cattggtggg acttttcagt ttggctataa cctctctatc atcaatgccc 240 
cgaccttgca cattcaggaa ttcaccaatg agacatggca ggcgcgtact ggagagccac 300
tgcccgatca cctagtcctg cttatgtggt ccctcatcgt gtctctgtat cccctgggag 360
gcctctttgg agcactgctt gcaggtccct tggccatcac gctgggaagg aagaagtccc 420
tcctggtgaa taacatcttt gtggtgtcag cagcaatcct gtttggattc agccgcaaag 480
caggctcctt tgagatgatc atgctgggaa gactgctcgt gggagtcaat gcaggtgtga 540 
gcatgaacat ccagcccatg tacctggggg agagcgcccc taaggagctc cgaggagctg 600 
tggccatgag ctcagccatc tttacggctc tggggatcgt gatgggacag gtggtcggac 660 
tcagggagct cctaggtggc cctcaggcct ggcccctgct gctggccagc tgcctggtgc 720 
ccggggcgct ccagctcgcc tccctgcctc tgctccctga aagcccgcgc tacctcctca 780 
ttgactgtgg agacaccgag gcctgcctgg cagagacggg ttctcgcttg tccaggctgg 840 
agtgctgtgg ctgttcatag gcatgacccc attgttgatc agcacggaag ttttcttctt 900
ttttgttttt gtttttttgg ttttgtttgg gacggggtct cactctgtcg cccaggctgg 960
agtggtgtga tctcggctcg ctgcagcctc cacctcccgg gcccaatcgg ttctcccgcc 1020
tcagcctcct gggtggctgg gactgctggc ccgtgccacc acgcttggct aatttttttt 1080 
tattattgta ttttttgtaa agatggagtt tcacctcttt gcctgggcag gtctcaaact 1140 
cctgagatca aatgatcctc cccccttggc ctcccaaagt gcgtggatta taggcatgag 1200
ccattgtatc tggctagcat gggagttttg aactgtccca tttccaacct gggccagtgc 1260 
attcctcctt aggcagcctg gtggtccctg ctcctgggat gtcactatat tgatgctgaa 1320
cttagtgcag acacctgatc tgcctagcgt actgcaaccc agagctcctg ggcccaggcg 1380
atcctcctgt ctcagcctcc tgagtagctg ggactctagg cacacaccac tatgcgtggc 1440 
tctccatgct tcttgggtct accctctgag atgtttttcc ttttctttca ccttccttga 1500 
ttccttctga agagggcgtt gcacaatgtg ctgcttttga tggttgagca aatttctcag 1560 
cctccttcct gcctatagag agttggggca ggctgggcgc cagctcacgc ctgtaatccc 1620 
agggaggctg aggcgggcag atcacgaggt caggacatca agaccggcct ggccgacatg 1680 
gtgggacccc atctctacta acaatacaaa aattggctgg gtatggtggc acgtgcctgt 1740 






agcgcatcac cggcctgtct 
tcaatctcct gaactacatg 
agttcttcaa catcggggac 
acatggtgtt ggcacctgtg 
tgtgcggggg cattgccttc 
agcatttctg gctgctcctc 
ccaccatcgc gcccactctc 
tcagcatctt ctactttgcc 
aagtgaagga tatggctgga 
tggtggccgt tctgctgctg 
gccactcaga tttgccaccc 
caagaaatct catctttgga 
gtgtggagat cagccgccgg 
ccactggcct cctgggctct 
gcatcgtggc cacttatatt 
ccatcgtggc cgacattctg 
ccttccagat cgtgctgtcc 
tgatctctga ccgcctgcgc 
tgcagttctc gctcatgctc 
gcaccgccat cttcattgag 
tgcacgaagc agggtccaca 
gcgtgcccgt ggccagtgtg 
cacagctggc cctgggccca 
gcttccagag ggaccctggg 
gaggaggtgg gggtccagga 
gctatttgta acggaataaa 



cccggccgtt cggctctcat 540 
gaccgcttca ccgtggctgg 600
agtagctctg ggctcatcca 660 
tttggctacc tgggtgacag 720 
tggtccctgg tgacactggg 780 
ctgacccggg gcctggtggg 840 
attgccgacc tctttgtggc 900
attccggtgg gcagtggtct 960
gactggcact gggctctgag 1020
ttcctggtag tgcgggagcc 1080 
ctgaacccca cctcgtggtg 1140 
ctcatcacct gcctgaccgg 1200 
ctccgccact ccaacccccg 1260 
gcacccttcc tcttcctgtc 1320 
ttcatcttca ttggagagac 1380
ctgtacgtgg tgatccctac 1440 
cacctgctgg gtgatgctgg 1500 
cggaactggc ccccctcctt 1560 
tgcgcgtttg ttggggcact 1620 
gccgaccgcc ggcgggcaca 1680 
gacgaccgga ttgtggtgcc 1740 
ctcatctgag aggctgccgc 1800 
ccccacgaag ggcctgggcc 1860
ccgtgtgcca gctcccagac 1920 
gggggatccc tctccacagg 1980 
atttgtagcc agacaaaaaa 2040 

2043 
ggtcccggct gctggggagg ctgaggcggg agagttgctt gggcccggga ggcggaggtt 1800
gcagtggcgg gagaattgct tggggcccgg gaggcggagg ttgcggtgag ccgagattgt 1860 
gccagtgcac actgcactcc agcctggtga cagagtgaga ctccgtcttc aaaaa 1915 

<210> 41 

<211> 1809 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5674114CB1 

<400> 41 

atgggcctgg ccagggccct acgccgcctc agcggcgccc tggattcggg agacagccgg 60 
gcgggcgatg aagaggaggc cgggcccggg ttgtgccgca acgggtgggc gccggcaccg 120 
gtgcagtcac ccgtgggccg gcgccgcggt cgcttcgtca agaaagacgg gcactgcaac 180 
gtgcgtttcg taaacctggg tggccagggc gcgcgctacc tgagcgacct gttcaccaca 240
tgcgtggacg tgcgctggcg ctggatgtgc ctgctcttct cctgctcctt cctcgcctcc 300
tggctgctct tcggcctggc cttctggctc attgcctcgc tgcacggcga cctggccgcc 360
ccgccaccgc ccgcgccctg cttctcacac gtggccagct tcctggccgc cttcctcttc 420 
gcgctggaga cgcagacgtc catcggctac ggcgtgcgca gcgtcaccga ggagtgcccg 480 
gccgctgtgg ccgccgtggt gctgcagtgc attgccggct gcgtgctcga cgccttcgtc 540 
gtgggtgctg tcatggccaa gatggccaaa cccaagaagc gcaacgagac gctggtcttc 600 
agcgagaacg ccgtcgtggc gctgcgcgac caccgcctct gcctcatgtg gcgcgtcggc 660 
aacctgcgcc gcagccacct ggtcgaggcc cacgtgcgtg cccagctgct gcagccccgt 720 
gtgaccccag agggtgagta catcccgctg gaccaccagg atgtggatgt gggctttgat 780
ggaggcaccg atcgtatctt cctcgtgtcc cccatcacca tcgtccatga gatcgactct 840 
gccagtcctc tgtatgagct aggacgtgcc gagctggcca gggctgactt tgagctggtg 900 
gtcattctcg aggggatggt tgaggccaca gccatgacca cacagtgtcg ctcgtcctac 960 
ctccctggtg aactgctctg gggccatcgt tttgagccag ttctcttcca gcgtggctcc 1020 
cagtatgagg tcgactatcg ccacttccat cgcacttatg aggtcccagg gacaccggtc 1080 
tgcagtgcta aggagctgga tgaacgggca gagcaggctt cccacagcct caagtctagt 1140 
ttccccggct ctctgactgc attttgttat gagaatgaac ttgctctgag ctgctgccag 1200 
gaggaagatg aggacgatga gactgaggaa gggaatgggg tggaaacaga agatggggct 1260 
gctagccccc gagttctcac accaaccctg gcgctgaccc tgcctccatg atgcaaactg 1320 
atgtcccctt ccccgtgtat gcccccttcc ccaaggtagc aagatggagg gatggggctc 1380 
tctcctggga tgggggcagg tgttcctgaa taccgacagg cctgctgggt aaatgactag 1440 
gtggtaaggt tctgccatgc ctggtgaccc accatggaca tactggacct taattcctct 1500 
gcttctgtgc tccctcctga gaacccttta tgagcctgat tcctcagtct caccagaatt 1560 
ctggatcacc caagaggaaa agactggcag ttctagattc ctctatatgg ggagacctgg 1620 
attgttgacc agggtgagaa gccaatggta tagactgcct ctggggaagc aagttggcag 1680 
ttcttgaaca gcatcagata tcaagagttt gtaggtctgg attcacctaa gattcaaggg 1740 
agtgttgctt ctcaactcag ccaactgagt agcaaatcat ttgttctaga ccacctaagg 1800 
agggaaggt 1809

<210> 42 

<211> 1730 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 1254635CB1 

<400> 42 

ctttggccta ttataccatg gatgctaaaa atggttctaa ctgaaaaccc aaaccaagaa 60 

atagcaacaa gtctagaatt cttactacta caaaactcac ctggatccct aagggcacag 120

caaagaatga gctattacgg cagcagctat catattatca atgcggacgc aaaataccca 180 

ggctacccgc cagagcacat tatagctgag aagagaagag caagaagacg attacttcac 240 

aaagatggca gctgtaatgt ctacttcaag cacatttttg gagaatgggg aagctatgtg 300

gttgacatct tcaccactct tgtggacacc aagtggcgcc atatgtttgt gatattttct 360

ttatcttata ttctctcgtg gttgatattt ggctctgtct tttggctcat agcctttcat 420 

catggcgatc tattaaatga tccagacatc acaccttgtg ttgacaacgt ccattctttc 480 

acaggggcct ttttgttctc cctagagacc caaaccacca taggatatgg ttatcgctgt 540 

gttactgaag aatgttctgt ggccgtgctc atggtgatcc tccagtccat cttaagttgc 600 

atcataaata cctttatcat tggagctgcc ttggccaaaa tggcaactgc tcgaaagaga 660 
gcccaaacca
ctcatgtggc 
caacttctcc 
aaattagtca
catgagagcc 
ttggtgacat 
tatgttcccc 
aagtattaca 
tgcagtgcca 
gttcgagaat 
attgtcagca 
gaaacacctt 
tagtcctaaa 
gtcgttgtaa 
aagtagagta 
aaaatttttc 
tttatctttt 
cggggaatga 



ttcgtttcag 
gcattggtga 
gctatacaga 
acgaccaaat 
ctctgtatgc 
ttatctatac 
gagaaattct 
aagtgaactg 
agcaattgga 
cctgcacgtc 
gctgtgaaaa 
atcagaaagc 
ttgcaattat 
acgtggcttt 
agttaaactt 
ttgttcgcca 
ttattatctt 
aggcaggaag 



ctactttgca 
ttttcggcca 
agacagtgaa 
catcctggtc 
ccttgaccgc 
tggtgattcc 
ctggggccat 
cttacagttt 
ctggaaagac 
ggacaccaag 
ccctgaggag 
tctcctgact 
gagggctacc 
tttgaaagtg 
ggtaaaagat 
attttgtatt 
acatgcttgt 
gaggctggaa 



<210> 43 

<211> 1147 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 1670595CB1 



cttataggta 
aaccacgtgg 
gggaggatga 
accccggtaa 
aaagcagtag 
actggaacat 
aggtttaatg 
gaaggaagtg 
cagcagctcc 
gcgagacgaa 
accaccactt 
ttaaacagaa 
actgaatcat 
ttatggctat 
aatctaaaaa 
aagaatgcta 
atcttcagtt 
taaataaaaa 



tgagagatgg 
tagaaggaac 
cgatggcatt 
ctattgtcca 
ccaaagataa 
ctcaccaatc 
atgtcttgga 
tggaagtata 
acatagaaaa 
ggtcatttag 
ccgccacaca 
tctctgtaga 
tttatctttc 
gttttatgat 
ttccatagtt 
ttaagcctaa 
ggaggtgtag 
taaaatgatt 



gaagctttgc 
agttagagcc 
taaagacctc 
tgaaattgac 
ctttgagatt 
tagaagctcc 
agttaagagg 
tgcccccttt 
agcaccacca 
tgcagttgcc 
tgaatatagg 
atcccaaatg 
agccaatcaa 
gatgctgggt 
ctcagttatt 
ttgattaaaa 
tattcaaaaa 



720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1730 



<400> 43 

gcagctgtct 

ggatcgcccc 

tacctgtgct 

gaggaggggg 

aagctgtcca 

gtttttctga 

ctgctggttg 

aatggagccg 

ccacagggcc 

ggctccctgc 

cgggcctcat 

tttgtggctg 

tccctgccgc 

ttgggcgctg 

tttgctgtcc 

ggaccccagc 

ctatgccctg 

aggcctcctc 

caccggggtg 

aaagacc 



tttccggccc 
cgagcgctgc 
cagccatggc 
ttgccctccc 
ccttcctggg 
ggattgggtt 
cctacttcat 
tgcagggggg 
cagtgggctc 
tgctgggcct 
tcctcacatt 
tggggccgag 
cccggtttgg 
gctatgctga 
tctttaacgg 
cgggcgatcc
cttttctttc 
tgtgactctg 
gtgatgtttt 



ccgtgcactc 
gtcctgcggg 
cagcgagagc 
tgccaatggg 
tgtggtggtg 
cgtggtgggt 
cctggcactc 
cggagcctac 
cgggtcctgc 
tgtgggtggg 
cctgctggtc 
ggacatccgc 
ccacttcacc 
ggactacacc 
caggcatcat 
ctctgggcac 
tctccagcct 
ggctacctca 
cgttctgttt 



tccgcccgag 
tgggtcacct 
tcacctctgc 
gccgggggtc 
cccactgtcc 
catgctgggc 
accgtcctct 
tgtatcctcc 
cccagggcta 
gtctgcacct 
tctggctccc 
ttgactccta 
ggcttcaaca 
acgggagccg 
ggctggggcc 
gatcgtcgcc 
cccttcactg 
gtttccccat 
tatttttcta 



gcggagcccc 
aacccatttg 
tggcctaccg 
ctggaggggc 
tgtccatgtt 
tactgcaggc 
ctgtctgtgc 
aacatcgatg 
cggcttggaa 
tgggagccgg 
tggcctctgt 
ggcctggccc 
gcagtaccct 
tgatgaattt 
aacatgtcag 
gtcgcctaca 
gtgccttgat 
tttggccaga 
actctgcatg 



cggctcgcgg 
tggcttcctc 
gctcctgggg 
gtctgcccgg 
cagcatagtt 
cctggccatg 
catcgccacc 
gactgggatg 
cctgctgtat 
cctctatgcc 
gctcatcagt 
caatggctcc 
gaaggacaac 
tgccagcgtc 
gggagctgaa 
ccttcttcgt 
gctaggggcc 
ctcaccggcc 
accatgaata 



60 

120 

180 

240 

300 

360 

420 

480 

540

600

660

720 

780 

840 

900 

960 

1020 

1080 

1140 

1147 



<210> 44 
<211> 2745 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature
<223> Incyte ID No: 1859560CB1



<400> 44 

cggcgacgcc 

agccacaggt 

tcgagcgccg 

cggagccccg 

tagagatgcc 

ccccgagcgc 

ggctgcccag 



agggacccca 
gggggctgag 
cagtccaccg 
ctggatctcc 
ttcttcggtg 
ctgctgctgc 
ctactccctg 



cgcatcccga 
cgaggcgtgg 
tagcgggtgg 
tggctgccac 
acggcgctgg 
tcccctgcgg 
cagtggctga 



gtgaagcaac 
cctcaggagc 
agcccgcctt 
ccgcaccccc 
gtcaggccag 
ccctgcagag 
agatggattt 



tagaactcca 
ggaggacccc 
ggtgcgcagt 
cgccagccta 
gtcctctggc 
gaggctgccc 
cgtcgccggc 



gggctgtgaa 60 
ccactctccc 120 
tggaaaacct 180 
cgccccaccg 240 
cccgggatgg 300
atcctggcgt 360
ctctcagttg 420 
gcctcactgc cattccccag gcgctggcct atgctgaagt ggctggactc ccgccccagt 480
atggcctcta ctctgccttc atgggctgct tcgtgtattt cttcctgggc acctcccggg 540 
atgtgactct gggccccacc gccattatgt ccctcctggt ctccttctac accttccatg 600 
agcccgccta cgctgtgctg ctggccttcc tgtccggctg catccagctg gccatggggg 660 
tcctgcgttt ggggttcctg ctggacttca tttcctaccc cgtcattaaa ggcttcacct 720 
ctgctgctgc cgtcaccatc ggctttggac agatcaagaa cctgctggga ctacagaaca 780 
tccccaggcc gttcttcctg caggtgtacc acaccttcct caggattgca gagaccaggg 840
taggtgacgc cgtcctgggg ctggtctgca tgctgctgct gctggtgctg aagctgatgc 900 
gggaccacgt gcctcccgtc caccccgaga tgccccctgg tgtgcggctc agccgtgggc 960
tggtctgggc tgccacgaca gctcgcaacg ccctggtggt ctccttcgca gccctggttg 1020
cgtactcctt cgaggtgact ggataccagc ctttcatcct aacaggggag acagctgagg 1080
ggctccctcc agtccggatc ccgcccttct cagtgaccac agccaacggg acgatctcct 1140 
tcaccgagat ggtgcaggac atgggagccg ggctggccgt ggtgcccctg atgggcctcc 1200
tggagagcat tgcggtggcc aaagccttcg catctcagaa taattaccgc atcgatgcca 1260 
accaggagct gctggccatc ggtctcacca acatgttggg ctccctcgtc tcctcctacc 1320
cggtcacagg cagctttgga cggacagccg tgaacgctca gtcgggggtg tgcaccccgg 1380 
cggggggcct ggtgacggga gtgctggtgc tgctgtctct ggactacctg acctcactgt 1440 
tctactacat ccccaagtct gccctggctg ccgtcatcat catggccgtg gccccgctgt 1500 
tcgacaccaa gatcttcagg acgctctggc gtgttaagag gctggacctg ctgcccctgt 1560 
gcgtgacctt cctgctgtgc ttctgggagg tgcagtacgg catcctggcc ggggccctgg 1620
tgtctctgct catgctcctg cactctgcag ccaggcctga gaccaaggtg tcagaggggc 1680 
cggttctggt cctgcagccg gccagcggcc tgtccttccc tgccatggag gctctgcggg 1740 
aggagatcct aagccgggcc ctggaagtgt ccccgccacg ctgcctggtc ctggagtgca 1800 
cccatgtctg cagcatcgac tacactgtgg tgctgggact cggcgagctc ctccaggact 1860 
tccagaagca gggcgtcgcc ctggcctttg tgggcctgca ggtccccgtt ctccgtgtcc 1920 
tgctgtccgc tgacctgaag gggttccagt acttctctac cctggaagaa gcagagaagc 1980 
acctgaggca ggagccaggg acccagccct acaacatcag agaagactcc attctggacc 2040 
aaaaggttgc cctgctcaag gcataatggg gccacccgtg ggcatccaca gtttgcaggg 2100 
tgttccggaa ggttcttgtc actgtgattg gatgctggat gccgcctgat agacatgctg 2160 
gcctggctga gaaacccctg agcaggtaac ccagggaaga gaaggaagcc aggcctggag 2220
gtccacggca gtgggagtgg ggctcactgg cttcctgtgg gatgactgga aaatgacctc 2280 
gctgctgttc cctggcatga ccctctttgg aagagtggtt tggagagagc cttctagaat 2340 
gacagactgt gcgaggaagc aggggcaggg gtttccagcc cgggctgtgc gaggcatcct 2400
ggggchggca gcaccttccc ggctcaccag tgccacctgc gggggaggga cggggcaggc 2460 
aggagtctgg gaggcgggtc cgctcctctt gtctgcggca tctgtgctct ccgagagaaa 2520 
accaaggtgt gtcaaatgac gtcaagtctc tatttaaaaa taattttgtg ttttctaaat 2580 
ggaaaaagtg atagctttgg tgattttgta aaagtcataa atgcttattg taaaaaatac 2640
aggaaaccac ccctcaccct gtccacttgg gtgatcattc cagacccctc cccaaacatg 2700 
catatgtacc tgtccgtcag tgtgtggatg tatgtttaca gttct 2745

<210> 45 

<211> 3204 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5530164CB1 

<400> 45 

cgacctctgg agctactgcg cctgcaagcc cagcctctct gcgccgcagg ctgcggggcc 60 
agctggcgcc gcacaaatac ggggcgggac acggggcggg acacgggccg gtcccggggg 120
agggcctgag ccgcacagcc cgcccagggg tggtgcgtgt aaacgggcgt ctggatcccc 180 
gaatggttgc gtgtttccgt gtgtgggtcc gggggaggcc cacgaacgcc agcgaaaccg 240 
ctgacaccac cgcccaacta tgaactcatc aggcgcctga agaccgacac gccgaacatg 300
cgccgcgcgc actcgcgcac gagtgagatc atcgcgcccc ggtcgtgagt gcgctcacac 360
gcagcctgag actcgacggg agggggtcac gtggaagtat ctgagagagg cgtacttggc 420 
cactaggaaa gcacctcccc ctttccaaaa atgctccgga agtgccttcg ccctccgtaa 480 
agatggccgg ggcagtcggc acgagggagg cggggatgcg cctgcgcaac aagttcggcg 540 
gggaagatgg cggatgacaa ggattctctg cctaagctta aggacctggc atttctcaag 600 
aaccagctgg aaagcctgca gcggcgtgta gaagacgaag tcaacagtgg agtgggccag 660 
gatggctcgc tgttgtcctc cccgttcctc aagggattcc tggctggcta tgtggtggcc 720 
aaactgaggg catcagcagt attgggcttt gctgtgggca cctgcactgg catctatgcg 780 
gctcaggcat atgctgtgcc caacgtggag aagacattaa gggactattt gcagttgcta 840 
cgcaaggggc ccgactagct ctaggtgcca tggaagaggc aggatgagca gctcagcctt 900 
caggtggaga cactttatct ggattcccca gctgtcatcc atttgctatc tccaactttc 960 
ctgccacctt catccttgcc tcccttcctg cagattgtgg acagtagttc ctcagcctgc 1020 
accctggatt
aaggaccacc 
cctcagtatt 
cttgattttc 
tgggtatgag 
ggaccctgat 
tgtgccatgt 
tgaccaagct 
ggcccccacc 
acaacaacct 
gtaatctcaa 
ctgtgcgtca 
ggggccttca 
ccatgcccct 
caggcttgtc 
ttcagaacct 
gcggctctgg 
gccaggcact 
gcctctttgt 
ggctgcagct 
tgtactatgg 
gggcgccacc 
acaagtgcct
ggtggatgaa 
ccatccccca 
aggcctgaga 
agttgatgaa 
cagagccact 
tcctcaccct 
ccaaaccact 
tctccacagt 
cttgactcat 
ctctgttcac 
accatgcttc 
ggggaagctc 
gtctctttaa 
cctacaacca 



ccttcttccc 
agccccttac 
ctctctggca 
cccaaacgtg 
tgtagaggat 
gctactccta 
ggacggccga 
actgttatgc 
ctggcgccag 
ggtgatctat 
gattggaagc 

ggggttagcg 

agttcccggg 
gcatatcact 
gtcagtgtac 
cttcctctac 
cccaggcctc 
aaatggactg 
ggtgtcctgc 
cacagccgcc 
cagccgctag 
accagatccc 
tgtgagaaaa 

ggggtacccc 

cccccaacca 
aataacccca 
agtggggtgt 
ggagctggct 
tccactttca 
ctgacagtct 
gctgctcccc 
ctcagggaat 
cctgagggct 
ccactctggg 
tgcacagagt 
ctgcataagc 
cagccaaaaa 



cttcctagct 
tcttcaagcc 
atgttccacg 
ttgcaatccc 
gggggtatgc 
tccactgcca 
gtgcccttcc 
gccttctccc 
gctgctccct 
cttcagcgtt 
acagctgtgc 
ctgctgctgc 
aacacccttc 
ccgctaggcc 
acagagctgc 
acttttggtg 
ctggaaggtt 
ctcatgtctg 
tcgctggtgg 
ttcttcctgg 
tccctgacaa 
cctcccaggc 
gctggagaag 
taggagatgt 
agttcttcca 
tccttgttgg 
gggcaacaag 
agtccagccc 
tgcaagaagg 
cctccagttc 
acacctagcc 
gtagcccctg 
gtcttgaagc 
gcctgcccct 
gacctgagac 
aataagatct 
aaaa 



ccatgggact 
ctgactgtgg 
gcttctcctt 
tgctgcccct 
caggcctggg 
tgtacggtgc 
ggccctcctc 
ttctggtagg 
tcgcactatc 
acatggaccc 
tctactgcct 
tgatggctgc 
ccagtccccc 
tgctgctcct 
tcatgaagcg 
tgcttctgaa 
tctcaggatg 
ctgtcatgaa 
tcaacgccgt 
ccacattgct 
cttccaccct 
cttcctccct 
tgagggcagc 
gaagtgtggg 
gactaaagaa 
gcagctccct 
tggctttcct 
agccatggtg 
cccagttgcc 
cagcaatgcc 
tttgttctgg 
ggccctggct 
ccgctaccca 
gcctagcagt 
caggtacagg 
taataaagtc 



cgccccaaga 
agttggtaga 
cctgggagct 
tagccaccca 
ccgtcccagg 
ccatgcccca 
agccgtgctg 
ctggcaagca 
agccctgctc 
cagcacctac 
ctgcctccgg 
gggagcctgc 
tccagcagct 
cattctgtac 
acagcggctg 
tctaggtctg 
ggcagcactc 
gcatggcagc 
gctctcagca 
cattggcctg 
gattccggac 
ctcccatcag 
caggttattc 
tttggttaag 
ttaaggtaac 
gcfcttgtcct 
tgcctacttt 
catgactctt 
acagattata 
tagagacatg 
aaaccccaga 
taagccgaca 
ctctgaggct 
ctcccagctc 
aaacctgtag 
ttctaggctg 



ctgtggcttc 
tgcctctgat 
ggctccataa 
gggtcttgtg 
caggcccgct 
ttgctggcac 
ctgactgagc 
tggccccagg 
tatggcgcta 
caggtgctga 
caccgcctct 
tatgcagcag 
gctgccagcc 
tgcctcatct 
cccctggcac 
catgctggcg 
gtggtgctga 
agcatcacac 
gtcctgctac 
gccatgcgcc 
cctgtagatt 
cagccctgta 
tctggaggtt 
gaaatgctta 
atcaatacct 
gcatgaacag 
agtcacccag 
ccataaggga 
caaccattac 
ctccctgccc 
gagggctggg 
ctcctgacct 
cctaggaggt
ccaacagcct 
ctcaatcagt 
tagggtggtt 



1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120

3180 

3204 



<210> 46 

<211> 2763 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature
<223> Incyte ID No: 139115CB1



<400> 46 

tgcatttgct 

ggaagaaact 

caaaagcagc 

tctgcagatg 

tattagtgat 

aaccagcgtt 

atctaccttc 

ctatatagtt 

ctttctactt 

gctaggtttt 

tattttattt 

atgtagtgaa 

tggtaagaga 

ggtaattggc 

tgaagttttt 

aggaatatgg 

taccacgatg 

agccagggtg 

aaaagtggtt 



atgactttga 
ggcaactaca 
ccaatttttg 
gacataagtg 
cactacggac 
tggctctgtt 
attggtgcat 
gatcagtgta 
ggacttgtta 
gagtggtcgt 
tttctcggag 
ggcttcaaaa 
cgatttttgc 
attgccccaa 
ataggttatg 
cttttttctt 
acaggaatgg 
ccgttccttt 
cgttcgactg 



ccggtccact 
ctttttcatc 
cattccagga 
gattaattcc 
gaaaattccc 
tgctttgcta 
tttgtggcaa 
aagaacacaa 
ctggactaac 
ttctaattat 
atccagtgaa 
acctatttta 
tctgtttgtt 
tttttatcct 
gatcagcttt 
attgtatgga 
ctatgaccgc 
tcactattgt 
aacaaggtac 



gacaacgcaa 
tgatagcaat 
ggaagttcag 
tggtctagtg 
tatgattttg 
ttttgccttt 
ttataccaca 
acaaaaaaca 
aggactgtca 
tgctgtgtct 
agagtgttca 
ccgaacttac 
actttttaca 
ttatgaattg 
gggtagtgcc 
agatattcat 
gtttgccagt 
gccattctct 
cctgtttgct 



tatgtttatc 
atttctgagt 
aaaaaagtgt 
tctacattca 
tcttccgttg 
ccattccagc 
ttttggggag 
attcgaatag 
tctggctatt 
cttgctgtta 
tctcagaatg 
atgcttttta 
gtaatcactt 
gattcaccac 
tcttttttga 
atggccttca 
acaacactga 
gttctacggt 
tgtattgctt 



ggagaatatg 
gtgaaaaaaa 
cacgttttaa 
tacttttgtc 
gtgctcttgc 
ttttgattgc 
cttgctttgc 
ctatcattga 
ttattagaga 
atttgatcta 
ttactatgtc 
agaatgcttc 
atttttttgt 
tctgctggaa 
ctagtttcct 
ttgggatttt 
tgatgttttt 
ccatgttgtc 
tcttagaaac 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 
acttggagga
gtaccctggc 
atgtgttgtc 
atccagtgaa 
atgcacatat 
cactgagaac 
tggctttagc 
cttgcaatac 
taaaacttaa 
tctgacatat 
atcagcagaa 
tgatgaatca 
ctatacttgg 
tatatactgt 
aaatccttaa 
catcggtgac 
ataaataaaa 
gtaactaatg 
atgatccctg 
ttcaattata 
aatccgtcag 
actccttccc 
aaaaacacca 
ctcccacatt 
gccttcttga 
gcacttcgat 
gccttgagta 
eta 



gtcactgcag 
ttcactttcc 
aagtgtacca 
gatgettcag 
catataccat 
caattttacc 
ttcttgtggt 
acagaacaaa 
gggtgtaagg 
tccaggctct 
tgggtttctg 
ctcattggta 
taatgctttg 
gacttctgaa 
aagggaggca
aggcatgata 
ttgctctaaa 
tttggagcca 
gctgagaatt 
aatactgeta 
cacggtcact 
atctcatttc 
tgtgttctca 
acattgaaac 
catccctcct 
cccagcattt 
ctgggacaac 



tttctacttt 
tgctgtctgc 
gctggaatga 
acaggtgact 
gacttctgaa 
tatcttttct 
accacgcact 
tttcaaatac 
agggatcaag 
tacactgaga 
gcctctctca 
ggaaaataat 
ttttatagag 
gactatacat 
ctttaaagaa 
atatttctat 
gaagttaaaa 
acatttgttc 
ctgcctctag 
agggcatttt 
cataggaaaa 
ttactgeett 
tgcctccatg 
tttcaagect 
cactccccag 
tccatcgact 
ctttgattac 



taatggaatt 
tggtctgtta 
gggaagctat 
gtgatttaaa 
gactataaat 
tctaaactga
ttgagcactt 
gcctcacttt 
aaacttgata 
ccaaagagaa 
gggataattt 
gatataagtt 
cctgttaagc 
gaattccaca 
tatgtatttt 
atgtaatggg 
aactgaatga 
cttgtgtcag 
tctttcttac 
taaaatacga 
tgatcaaaca 
acgctcatcc 
tcttttcaca 
cagtcgaaac 
tccctacagg 
tgtaattgtt 
tcattatatc 



tactcagcca 
ctacttccag 
gaacttctta 
caaacaaaaa 
gaattccaca 
acagtcagag 
tgtgcgtatc 
tagacttaga 
aggtcaaaag 
atctttacct 
tgaaggcata 
tcaaatatgt 
tgctattgat 
ateagtgett 
tcacttttct 
taattgggaa 
acagctaata 
caaaaggata 
ccagctgttg 
tcttgtagtc 
ageaagecag 
tgaggtccac 
cactgttcca 
attgettett 
gcttccatag 
tctgctacct 
ctcaataaat 



ctgttgcttg 
ccatcagtct 
tacaagaaga 
aaatctatga 
ateagtgett 
agacagctcc 
atgeaatata 
agagaaacat 
caataatctc 
cagtttcttc 
atgaaaatta 
atgattttac 
agteggaget 
tgttgataca 
taatatgttt 
aaaatagatg 
ctggtataaa 
ttcacattcc 
tctatccttg 
cttaaatttg 
tcatgatttg 
cttggtctct 
tttgetctte 
ctggatagca 
ctctttgtgt 
gacaatcatc 
atttgttgaa 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2763 



<210> 47 
<211> 1639 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 1702940CB1



<400> 47 

ategcactga 

gateggggeg 

aactctggtt 

gatgtgaggc 

cagctggacg 

ggaagatccc

gaacgagggt 

ccctcggtaa 

accttaagta 

atgaagcctg 

teegtaaage 

ataaagacca 

ttgaggatca 

gcaccaccat 

tcctcggcct 

gcatgggtct 

taaacaaatt 

caaaggtgat 

attggtacca 

ccaaccctca 

aaggcggtga 

ecatgategt 

catatgagtc 

agegggctea 

agccaggcca 

caggecagga 

cgctttttcc 

ggggggattc 



ggcttgagtc 
ggggttgggg 
ctgactgcct 
agggtatatc 
tgggatccac 
ttggacaaga
taaggaaaac 
cccagccgct 
tttccaggac 
gaatggattc 
tctgaacaag 
gcagcacagg 
cataaggaag 
tgccaatgtg 
gggtctggca 
gggagcagca 
gegggcaega 
gaaggagttt 
agtcacacaa 
gttaggagcg 
acaggttgag 

gggtgcagcc 

aaagcacttg 
ggagctggag 
agaccaatga 
caaaatgeag 
cccagtaggg 
gattaaege 



tgacttctct 
ggcaagegge 
gagacatggg 
tgggaggccg 
acagctcaga 
ggaccctgcc 
cttccagtct 
ggcaccatga 
caagtgagca 
gtggctgctg 
ettgeaagtc 
cagtggtttt 
ctccgtgccc 
gtgtccaact 
cccttcacag 
gctgctgtgg 
gcccaagccc 
gtgggtggga 
gggattggga 
tatgccccac 
agggttgttg 
actggaggca 
cttgaggggg 
gggaagctca
ccccagagca 
actttttttt 
gtggcggggc 



cccccacctg 
tcagatgggt 
cagctgacac 
gaggacgtgt 
acagttggat 
ttggtgtgag 
ggacagtgac 
acccagagag 
gagagaatct 
ctgaactgcc 
acatggtcat 
tgaaagagtt 
ttgcagagga 
ctgttggcac 
aaggaatcag 
ctgggattac 
gcaacttgga 
acacacccaa 
ggaacatccg 
ccccgcatgt 
aaggccccgc 
tettgettet 
caaagtcaga 
actttctcac 
gtgcagccac 
ttttcaagtc 
ccaactctgg 



ctgtgccctt 
tcaaaaaact 
agcagacctt 
ctggttatta 
ettgetcagt 
agtgagggta 
tggagagctc 
cagtatcttt 
gctacaactg 
cagggatgag 
gaaggacaaa 
tcctcggttg 
ggttgagcag 
tacctctggc 
ttttgtgctc 
ctgcagtgtg 
ccaaagcggc 
tgttcttacc 
tgecatcaga 
cattgggega 
ccaggcaatg 
gctggatgtg 
gtcagctgag 
caagatccat 
cagggcagaa 
tttgacgggg 
gccgtgtgaa 



aaactgeaga 
ccccaggctc 
gaatcctgag 
cacagatgea 
ctctgtcaga 
gaggaagctg 
caaggaaagc 
attgaggatt 
ctgactgatg 
gcagatgagc 
aaccgccacg 
aaaagggagc 
gtccacagag 
atcctgaccc 
ttggacactg 
gtagaactag 
accaatgtag 
ttagttgaca 
egagecagag 
atctcagctg 
agcagaggaa 
gtcagccttg 
gagctgaaga 
gagatgetge 
atgeegggea 
aagggagctc 
cctcccgggg 



60 

120 

180

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1639 
<210> 48
<211> 1600 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 1703342CB1 



<400> 48 

caaggcggcc caggacaggc aggggctgca 
caagcccctt gccttgggtc acacagccaa 
ccagaggcaa cagggacatg gccacctggg 
tggctcccgc tgagaggatg agcaagttct 
accatgcctg gaacatcaac tacaagaaat 
agcagccacc acccacacca gtctcaggcg 
cccctgcccc tggccccgca cccagggccc 
tgttcagctc ccacaggttt caggtcatca 
tggtgcttgc tgagctcatc ctggacctga 
ctgccatggt attccactac atgagcatca 
tctttaaatt atttgtcttc cgcctggagt 
ccgtcgtggt ggtggtctca ttcatcctcg 
ttgaggctct gggcctgctg attctgctcc 
ggattatcat ctcagttaag acacgttcag 
atgtacaatt ggccgccaag attcaacacc 
aaattgaaag acttaacaaa ctattgcgac 
cccggaccag ctcccctcaa aaagaagaca 
gaacagctgc ccctcctggg ccgcttggtg 
cctgccagca tggattctgg gtggacacag 
tgcccatcca ctcccacccc acactgtatc 
tttagcctta attgaaaatg agcaacaaag 
aatctcaccg aatgtacagt tttcaaattt 
agcattctga aagaaagaaa aagaagctac 
cttattttca agctgttcta aatagcttcg 
ccctcctctg tgtgccccag ctgcatcagc 
cacctgacaa catttttcct caattactgt 
gtataaaata aactctctct tttccctgga 

<210> 49 
<211> 2380 
<212> DNA 

<213> Homo sapiens 



cgcggtgaag aaaccaagac gcagagaggc 60 
aggaggcaga gccagaactc acaaccagat 120 
acgaaaaggc agtcacccgc agggccaagg 180 
taaggcactt cacggtcgtg ggagacgact 240 
gggagaatga agaggaggag gaggaggagg 300
aggaaggcag agctgcagcc cctgacgttg 360
cccttgactt caggggcatg ttgaggaaac 420 
tcatctgctt ggtggttctg gatgccctcc 480 
agatcatcca gcccgacaag aataactatg 540 
ccatcttggt cttttttatg atggagatca 600 
tctttcacca caagtttgag atcctggatg 660 
acattgtcct cctgttccag gagcaccagt 720
ggctgtggcg ggtggcccgg atcatcaatg 780 
aacggcaact cttaaggtta aaacagatga 840 
ttgagttcag ctgctctgag aaggaacaag 900
agcatggact tcttggtgaa gtgaactaga 960
ctgtctcatg ggcctgtgct gtcacgagag 1020 
agaggtttgg tttgatacct ctgcctccct 1080 
ccttgtggaa ggtccagtac caccaagagc 1140 
aaatgtatca cattttctca tgttgaacac 1200 
ctggacaatt gctagttgta tataaaattt 1260 
cacgtgtata ttaaggaact gatgcatctg 1320 
tttagctgcc accccattct agaaaagtct 1380
tctcagtttc cccaaaaggg gtacccaggc 1440 
cagcttctag gtggctccat tgttttctgc 1500 
acaactactg tataaaataa aacaactact 1560 
aaaaaaaaaa 1600 



<220> 

<221> misc_feature 

<223> Incyte ID No: 1727529CB1 



<400> 49 

ctgagccatg 

atacgacccc 

cgtcctcttc 

tggagacccc 

ggagaacaaa 

caacatcatc 

ctcctgcccg 

agtcttctat 

gatcacaagc 

gggacgctgc 

caccaccata 

tgttaagatc 

ggctctggtc 

gctggtgctg 

ggagtaccga 

cctcagtgcc 

gcttgaagcc 

cgccctcctg 

actggtcacc 



gggggaaagc 
tcctttcgag 
ctgctcttca 
cggcaagtcc 
gataagccgt 
tcagttgctg 
gaggacccat 
acaaaaaaca 
ctgcaacagg 
tttccatgga 
cagcagggga 
tttgaagatt 
ttgagcctac 
atcctgggag 
gtgctgcggg 
taccagagcg 
atcctgctgc 
aaggaggcca 
tttgtcctcc 



agcgggacga 
gccccatcaa 
ttctaggtta 
tctaccccag 
atctcctgta 
agaacggcct 
ggactgtggg 
ggaacttttg 
aactctgccc 
ccaacattac 
tcagcggtct 
ttgcccagtc 
tgtttatctt 
tgctgggcgt 
acaagggcgc 
tgcaggagac 
tggtgctcat 
gcaaggctgt 
tcctcatctg 



ggatgacgag 
gaacagaagc 
catcgtggtg 
gaactctact 
cttcaacatc 
acagtgcccc 
aaaaaacgag 
tctgccaggg 
cagtttcctc 
tccaccggcg 
tattgacagc 
ctggtattgg 
gcttctgcgc 
gctggcatac 
ctccatctcc 
ctggctggcc 
cttcctgcgg 
gggacagatg 
cattgcctac 



gcctacggga 
tgcacagatg 
gggattgtgg 

ggggcctact 

ttcagctgca 
acaccccagg 
ttctcacaga 
gtaccctgga 
ctcccctctg 
ctcccaggga 
ctcaatgccc 
attcttgttg 
ctggtggctg 
ggcatctact 
cagctgggtt 
gccctgatcg 
cagcggattc 
atgtctacca 
tgggccatga 



agccagtcaa 
tcatctgctg 
cctggttgta 
gtggcatggg 
tcctgtccag 
tgtgtgtgtc 
ctgttgggga 
atatgacggt 
ctccagctct 
tcaccaatga 
gagacatcag 
ccctgggggt 
ggcccctggt 
actgctggga 
tcaccaccaa 
tgttggcggt 
gtattgccat 
tgttctaccc 
ctgctctgta 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 
cctggctaca tcggggcaac
ctgtgagaaa gtgccaataa 
gtgcccaggg ctgatgtgcg 
tgtcttcaat ctgcaaatct 
ggccctgggc caatgcgtcc 
gccccaggac atccctacct 
cactgggtca ttggcatttg 
cttggagtat attgaccaca 
gtgctgtttc aagtgctgcc 
tgcatacatc atgatcgcca 
catgctactc atgcgaaaca 
gctgttcttt gggaagctgc 
ctccggtcgc atcccggggc 
gctgcccatc atgacctcca 
tttcggcatg tgtgtggaca 
cggctccctg gaccggccct 
gaacgaggcg cccccggaca 
gactgcaccc cacccccacc 
tttgtggtaa aaaaaggttt 
tgagaggctg aggcgggcgg 
ggtgaaacct ccgtctctat 

<210> 50 
<211> 3038 
<212> DNA 
<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 2289333CB1

<400> 50 

aggggcaggg aggcgggcac caggcgcggg tccctccggg caggcgaggt aggcctgggc 60 
ctgacgccgg ccacgcagcg gcgggagagt gagcactcgg gcggcggcgt cctggagacc 120 
cgcgagagat ggaagcggcg gcgacgccgg cggctgccgg ggcggcgagg cgcgaggagc 180 
tagatatgga tgtaatgagg cccttgataa atgagcagaa ttttgatggg acatcagatg 240 
aagaacatga gcaagagctt ctgcctgttc agaagcatta ccaacttgat gatcaagagg 300
gcatttcatt tgtacaaact cttatgcacc ttcttaaagg aaatattgga actggccttt 360
taggacttcc attggcaata aaaaatgcag gcatagtgct tggaccaatc agccttgtgt 420 
ttataggaat tatttctgtt cactgtatgc acatattggt acgttgcagt cactttctat 480 
gtctgaggtt taaaaagtca acattaggtt atagtgacac tgtgagcttt gctatggaag 540 
tgagtccttg gagttgtctt cagaagcaag cagcatgggg gcggagtgtg gttgactttt 600 
ttctggtgat aacacagctg ggattctgta gtgtttatat tgtcttctta gctgaaaatg 660 
tgaaacaagt tcatgaagga ttcctggaga gtaaagtgtt tatttcaaat agtaccaatt 720 
catcaaaccc ttgtgagaga agaagtgttg acctaaggat atatatgctt tgctttcttc 780 
catttataat tcttttggtc ttcattcgtg aactaaagaa tctatttgta ctttcattcc 840 
ttgccaacgt ttccatggct gtcagtcttg tgataattta ccagtatgtt gtcaggaaca 900 
tgccagatcc ccacaacctt ccaatagtgg ctggttggaa gaaataccca ctcttttttg 960
gtactgctgt atttgctttt gaaggcatag gagtggtcct tccactggaa aaccaaatga 1020 
aagaatcaaa gcgtttccct caagcgttga atattggcat ggggattgtt acaactttgt 1080 
atgtaacatt agctacttta ggatatatgt gtttccatga tgaaatcaaa ggcagcataa 1140
ctttaaatct tccccaagat gtatggttat atcaatcagt gaaaattcta tattcctttg 1200 
gcatttttgt gacatattca attcagttct atgttccagc agagatcatt atccctggga 1260 
tcacatccaa atttcatact aaatggaagc aaatctgtga atttgggata agatccttct 1320 
tggttagtat tacttgtgcc ggagcaattc ttattcctcg tttagacatt gtgatttcct 1380 
tcgttggagc tgtgagcagc agcacattgg ccctaatcct gccacctttg gttgaaattc 1440
ttacattttc gaaggaacat tataatatat ggatggtcct gaaaaatatt tctatagcat 1500
tcactggagt tgttggcttc ttattaggta catatataac tgttgaagaa attatttatc 1560 
ctactcccaa agttgtagct ggcactccac agagtccttt tctaaatttg aattcaacat 1620
gcttaacatc tggtttgaaa tagtaaaagc agaatcatga gtcttctatt tttgtcccat 1680
ttctgaaaat tatcaagata actagtaaaa tacattgcta tatacataaa aatggtaaca 1740
aactctgttt tctttggcac gatattaata ttttggaagt aatcataact ctttaccagt 1800 
agtggtaaac ctatgaaaaa tccttgcttt taagtgttag caatagttca aaaaattaag 1860 
ttctgaaaat tgaaaaaatt aaaatgtaaa aaaattaaag aataaaaata cttctattat 1920
tcttttatct cagtaagaaa taccttaacc aagatatctc tcttttatgc tactcttttg 1980 
ccactcactt gagaacagaa taggatttca acaataagag aataaaataa gaacatgtat 2040 
aacaaaaagc tctctccaga tcatccctgt gaatgccaaa gtaaacttta tgtacagtgt 2100 



cccagtatgt gctctgggca 
atacatcatg caaccccacg 
tcttccaggg ctactcatcc 
atggggtcct ggggctcttc 
tcgctggagc ctttgcctcc 
tccccttaat ctctgccttc 
gagccctcat cctgaccctt 
agctcagagg agtgcagaac 
tctggtgtct ggaaaaattt 
tctacgggaa gaatttctgt 
ttgtcagggt ggtcgtcctg 
tggtggtcgg aggcgtgggg 
tgggtaaaga ctttaagagc 
tcctgggggc ctatgtcatc 
cgctcttcct ctgcttcctg 
actacatgtc caagagcctt 
acaagaagag gaagaagtga 
gtccagccat ccaacctcac 
taggccaggc gccgtggctc 
atcacctgag tcaggagttc 
taaaaataca aaaattagcc 



tccaacatca gctcccccgg 1200
gcccaccttg tgaactcctc 1260
aaaggcctaa tccaacgttc 1320
tggaccctta actgggtact 1380
ttctactggg ccttccacaa 1440
atccgcacac tccgttacca 1500 
gtgcagatag cccgggtcat 1560 
cctgtagccc gctgcatcat 1620
atcaagttcc taaaccgcaa 1680
gtctcagcca aaaatgcgtt 1740
gacaaagtca cagacctgct 1800
gtcctgtcct tctttttttt 1860 
ccccacctca actattactg 1920
gccagcggct tcttcagcgt 1980
gaagacctgg agcggaacaa 2040
ctaaagattc tgggcaagaa 2100 
cagctccggc cctgatccag 2160 
ttcgccttac aggtctccat 2220
acgcctgtaa tccaacactt 2280 
gagaccagcc tggccaacat 2340 

2380 
aaaaaaaaaa
gtatagaaaa 
ttttacactc 
tagtgtttca 
tttcgtcaga 
ttgagatcaa 
atgtccaaga 
atattttaat 
aaatcaggga 
gctatgttct 
tgtgtgtcta 
agttttgtag 
ttgtggactc 
gaataaaaga 
taagttggga 
gacctcgact 



aatctcagtt 
ctcccattaa 
taatggtagt 
atgaaatttg 
ggaaaaggag 
gaaaacctaa 
aaaagaataa 
actttgaaaa 
actccaagaa 
tttaaaacag 
gctatactat 
gcaccacatt 
tttccagccc 
atcagctagg 
ggatcacttg 
ctataaaaca 



atgtttttat 
cataatataa 
tgatcttcat 
acaagggact 
gcctagaaag 
tcttctgact 
attatgttca 
tgaatgtgtg 
gcctacactg 
aatagagacc 
ttgtggcttg 
cctgaatggc 
tgtggctttt 
tgtggtggtc 
agaggccagg 
taaaaaaaaa 



tagccaaatt 
gcatcagaaa 
agtcaagagg 
ttaaaactta 
gttaagtaac 
cccaggccag 
gcttaatttt 
atttttaata 
tggccatata 
gcttgctggt 
agctttttta 
agaaaataga 
cttatcacag 
tgtgcttata 
agcttgagac 
aaaaaaaa 



ctaatgattg 
attgcaaaca 
cactgttcaa 
tccagtgcaa 
ttggtcgaga 
gatgttttat 
agtgttgaat 
gtatatgtga 
aacctcagca 
gaaactcctg 
attattacct 
cacctcagaa 
ccttttattt 
atcccagcta 
cagcctgggc 



gctcctggaa 
ctagaattaa 
gatcatgact 
ctcccttgtt 
ccactcagcc 
ttctcacatc 
ctatttgatt 
cctgagcaga 
agagaaagaa 
gctagtaaga 
tcctttcctg 
aacggaggat 
attatgagca 
ctctggagga 
agcatagtga 



2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3038 



<210> 51 

<211> 2608 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2720354CB1



<400> 51 

taggctaatt ttttttacag acacgatttc gccacgttgg ccaggctggt cttgaactcc 60
tgacctcaag tgatccaccc acctcagcct cccaaagtgt tgggattaca ggcgtgagcc 120
actgcacctg gccaggctca tcactttttg cgcctattgc ctcgaagcca gtctctgatg 180
ggacattagg gcaggggccc ttcagcctag tctgggacat gggccgctca ctcagcagta 240
tgacaagcat cacctggaga acgggccagt ctcaggaggt cgttcatgcc ccactggcag 300
tgcactgtgc agccatagtg taaacaagag gcttaacctg aactggtctg agatcttggg 360
gaccccctac cctgtctcca gcagcctgtc cctttagctg tttgcctact ggcaccccat 420
cctgagaagg catagatacc cggcccaccc tgccctggaa ttacaaaagt cttagactgt 480
gcctgagtgc ccggcctcct tgggagaccc tcctaggcag cctaagcacc agacccggga 540
gctgggtgct ctggtcctgc ctgcctgcct ctcactggac cctctccttc caggtacggc 600
ttcaggtcca gagcgtggag aagcctcagt accgcgggac gttgcactgc ttcaagtcca 660
tcatcaagca agagagcgtg ctgggcctgt acaagggcct gggctcgccg ctcatggggc 720
tcaccttcat caacgcgctg gtgttcgggg tgcagggcaa caccctccgg gccctgggcc 780
acgactcgcc cctcaaccag ttcctggcag gtgcggcggc gggcgccatc cagtgcgtca 840
tctgctgccc catggagctg gccaagacgc ggctgcagct gcaggacgcg ggcccagcgc 900
gcacctacaa gggctcgctg gactgcctcg cgcagatcta cgggcacgag ggtctgcgtg 960
gcgtcaaccg gggcatggtg tccacgttgc tgcgtgagac tcccagcttc ggcgtctact 1020
tcctcaccta tgacgctctc acgcgggcgc tgggctgcga gccgggcgac cgcctgctgg 1080
tgcccaagct gctgttggcg ggcggtacgt caggcatcgt gtcctggctc tctacctatc 1140
ctgtggacgt ggtcaagtcg cggctgcagg cggacggact gcggggcgcc ccgcgctacc 1200
gcggcatcct ggactgcgtg caccagagct accgcgccga gggctggcgc gtcttcacac 1260
gggggctggc gtccacgctg ctgcgcgcct tccccgtcaa cgctgccacc ttcgccaccg 1320
tcacggtggt gctcacctac gcgcgcggcg aggaggccgg gcccgagggc gaggctgtgc 1380
ccgccgcccc tgcggggcct gccctggcgc agccctccag cctgtgacgc tcaccccgcc 1440
ctccttcccc agggctcctt ctcagaaacc tgggacataa attggcccct gagtcgattg 1500
ccctgcttcc tgctgggatg ctgcgagctg tggagtctat cagatgtggg ctgaattttg 1560
ctgatcagct gggtagtttt ggccgagaac tgcacttgcc tcagtgttct catctatgaa 1620
ataaggaccc tcatgcccac actgtagagt cacgaagctc agagattatt cccagcagca 1680
gccagcacct ggcctggctg aggccattgc accgttatcc tggaaactga ggcagacact 1740
ccagcccctt tctgggatcc tggccacgtc attgtgctcc tgccctgcag gctggctccc 1800
gggggtctct gatggccaac caaggggcca cccagggacc tctaactcca cacatcctcc 1860
acccgggggg gtggtgggcc acccctctgg tctgtgttag ggacagagga aaacttggtg 1920
tgcctcctgg tgtcacagaa ctggatcctc tgcatacccc agcttctcca catgccactg 1980
ctaggggtac cccagctgct gccactcctg ctggagggtg aactggggac cctgcaccct 2040
ccgggaagcc atggagtctg ctggaggcac catatcagcc tgcgggacta gggtggggag 2100
caaacaggcc agcggtggag gtctggacag ttcaagtgtg atgcagctgt ggcaaggaga 2160
aatccttccg cctctgggcc tcaggctgcc tgtccataaa atggggacat ggccagctga 2220
cggacaactg agtctccggc ccacctacca ccgccagcca ggatccccca aagtgtgcag 2280
agggctcagc agagaacagt atgggacccc ctcaccaggc ctggaacacc tccagccaca 2340
aagaagccaa aggtcagtcc ctctgctccc cagcaaacgg tgcctcccag gcattctcag 2400
tgccagggct tcatccctgt gaaggcacag ggcctgctag tgggcacagg ggtggctagt 2460
tggggcctgg ggcagaggag ggctgcacca ggcgtcctgg ggaatgtgct cagtgaagac 2520 
gacactgggc tttgcacagc ctggtgtcgc tgtacagaaa ctgtcaaggg aataaagtgt 2580 
tctttgtttt ttaaaaaaaa aaaaaaaa 2608 

<210> 52 

<211> 3804 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 3038193CB1 

<400> 52 

ccctttcctg tcactggcta ctaccactcc caaccctcct caaagccgcc ggagcaaccc 60 
ccaggtcttt actttacaat cggcaatttg acttgctctg ctgcatgtct ggagggacca 120
aggaaagtgt ggagacgctc caaggattag gtgatcggag cttgaaaaga aaaaaagcca 180 
aacaaataaa caaaacccac ccaccctaac aaatatgagg ctgctggaga gaatgaggaa 240 
agactggttc atggtcggaa tagtgctggc gatcgctgga gctaaactgg agccgtccat 300
aggggtgaat gggggaccac tgaagccaga aataactgta tcctacattg ctgttgcaac 360
aatattcttt aacagtggac tatcattgaa aacagaggag ctgaccagtg ctttggtgca 420 
tctaaaactg catcttttta ttcagatctt tactcttgca ttcttcccag caacaatatg 480 
gctttttctt cagcttttat caatcacacc catcaacgaa tggcttttaa aaggtttgca 540 
gacagtaggt tgcatgcctc cgcctgtgtc ttctgcagtg attttaacca aggcagttgg 600 
tggaaatgag ggcatcgtta taacacccct gctcctgctg ctttttcttg gttcatcttc 660 
ttctgtgcct ttcacatcta ttttttctca gctttttatg actgttgtgg ttcctctcat 720 
cattggacag attgtccgaa gatacatcaa ggattggctt gagagaaaga agcctccttt 780 
tggtgctatc agcagcagtg tactcctcat gatcatctac acaacattct gtgacacgtt 840 
ctctaaccca aatattgacc tggataaatt cagccttgtt ctcatactgt tcataatatt 900 
ttctatccag ctgagtttta tgcttttaac tttcatcttt tcaacaagga ataattcggg 960 
tttcacacca gcagacacag tggctatcat tttctgttct acacacaaat cccttacatt 1020
gggaattccg atgctgaaga tcgtgtttgc aggctatgag catctctctt taatatctgt 1080 
acccttgctc atctaccacc cagctcagat ccttctggga agtgtgttgg tgccaacaat 1140 
caagtcttgg atggtatcaa ggcagaagaa actactccaa accagggggc cactggctaa 1200
cttgaataat ccagaaggct tggaatatct atccatcaaa tttgggcatt aaaataaata 1260
ccaagagtcc atcctccagg gagtgaagct gacaaggccg acagtataac aaaggaggtg 1320 
gactttctgt agcaatgtat atatgtacag gattgtacat actagcaatt ctgaagactt 1380
gtacttgtga atgttgcctc aatgcatatt ttattttttt acacaaaaat atgagatcct 1440 
gtttaagtgc cttaaaatgt atttgacaag agcgttattt ccacaatatg ctttgttgat 1500
tactgccagg ggtggtacaa tatttggggg ttaattttgc tttcctaatg caggaatcag 1560 
tcatggtaag tgacaaaaag caaacatgct ttccctgcag cacctttgtg taatacaacc 1620 
ctatagtagt tactgtaatg tttgaaatga ggtcacacca tcaggaaaat gcccttctga 1680 
tgacagtgaa aatttccaaa gtcttattca tgcatacttt gatttactgt gtgattcttt 1740 
ttttctacga ctgtgacatg cctcttcctt atcaactcag caggggtcat agatcgaata 1800 
gatgctgaaa agcgtaagat atatgcattc cttgacatca tttttaaaga cattccttca 1860 
aatagtttcc acacagaaat tcctcactcc cattatgaga gattgtggtt atatgtctta 1920 
aatttattat aagctgcttc aaagaaaggg tctgaatgtt tgaattatga gtgaaatcat 1980 
gtgaaatttt gagttaaact ctgtgatttg attttcaggg tctttaaaat atatcttaat 2040 
atcttcttcc tctttattca ataatttctg tcttgcactt acacactcat aacagccaaa 2100 
tatgaggcac aaaaatgtta caatcagttt gaaagcagca tcaattaatg gtagattcta 2160 
ttcacattcc acaacccaga ccaaattttt ttcctattac gcagatgtgc tgagcacttt 2220 
ccagattgcc cctgttggcc aaaagcagcc tgttacatcc tggaattaag cacacttaag 2280 
gtatttgaga caatttatta atgaaaattt ccttggcaga tttgacaaat gttggcaata 2340 
tttttttaaa agttaaatca tattgctttc atgaataaat gaaaatataa aggtcatgga 2400 
tgcaaacaaa tgttacatat acacattctg tctctccaga tgaaaagaac atgcaaaacc 2460 
atttaataac caaaatatca agtaaaatta gttcccaacg gggcagcagc tttcaaatga 2520 
gtgtccaata tttgcttctg ctatagctgc aagaactgta actggaccca agtagagaat 2580 
gaagccacgt atagaactac gagaacactt ttctgtgttt cccccatgcc gtcctgtcac 2640
atcctcttac acgtcctctc ttgatttgat agacaatatt ggcatcctgg gtctcactga 2700
ggccgtgcta tgtcctcagc agctgttttt gttgtttcgt tattatgccc acaacaaaaa 2760
atcattcctt agaaactcac caagtttatc tactgtgtaa atttatatta ttgttactac 2820 
caggtctcat cttttgtcaa tgtcattgaa taaatttcat aagagttatt ctcagtgtga 2880 
attttaaggc taatgccaga tcctgcaaaa atctatgcta accaggctgt agtacacact 2940
gttataaaga attttacttg tgtctaaaac tacagtaatt ttgcttaggt aattgtgctt 3000
acctatggag cacaggaagg ctcttaggtt ttgttcctac aagtttcttt gaattttgga 3060 
gtaaatggaa gtgtctgtct gtctgtcatc tatctgccct atcataaaaa tctttctccc 3120 
taacattaaa atactgatcc ccgcccccaa cttatctacc tctattgtct aacacctata 3180
gtaggtgtga tcatgggata aaattcaact gaaaatgcta tgataacatt ttatcgtttg 3240
ctttaaaaat gtgctttgtt ttcaaataat ctttacatag tgaactttgg tggcgttagt 3300
gatatgttta tgcctatttc ttttttttac acaaattcct tggcatattt tttcataaag 3360
aacaaaaaat aaaatcaaaa tttattttta attcatgctt attgggattt aattattcag 3420
agcttaaaat attttgttat gtttatacac tgtaaagcta tctgttttat gcatttgttt 3480
tgtctaaatg tatttatgaa agaaatacat tagattatat ttatgtttac tcatttttcc 3540
acctggattt tttttaatgg ttgttacaaa attagatttt ttaatgggta ataatgttgg 3600
tattttcatg ttttttctta gtattaaaat ttttgtgggt tttttaaaat ttttccctat 3660
tctgttaaaa attaacacac ctctagctaa tgttcagtgt ttgtgctaaa taccaaattt 3720
tttcaaaagg attggttaag tcataaagtg gattatttat gatgactgga agatgaaaat 3780
aattatatga ttaaacaaag aatg 3804



<210> 53 
<211> 1894 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 3460979CB1 



<220> 

<221> unsure 
<222> 1651 
<223> a, t, c, g, or other



<400> 53 

acggatcact agtatgcggc gcagtgtgct ggaaagggaa caaacatggc cgctctggcg 60
cccgtcggct cccccgcctc ccgcggtcct aggctggccg cgggcctccg gctgctccca 120
atgctgggtt tgctgcagtt gctggccgag cctggcctgg gccgcgtcca tcacctggca 180
ctcaaggatg atgtgaggca taaagttcat ctgaacacct ttggcttctt caaggatggg 240
tacatggtgg tgaatgtcag tagcctctca ctgaatgagc ctgaagacaa ggatgtgact 300
attggattta gcctagaccg tacaaagaat gatggctttt cttcttacct ggatgaagat 360
gtgaattact gtattttaaa gaaacagtct gtctctgtca cccttttaat cctagacatc 420
tccagaagtg aggtaagagt aaagtctcca ccagaagctg gtacccagtt accaaagatc 480
atcttcagca gggatgagaa agtccttggt cagagccagg agcctaatgt taaccctgct 540
tcagcaggca accagaccca gaagacacaa gatggtggaa agtctaaaag aagtacagtg 600
gattcaaagg ccatgggaga gaaatccttt tctgttcata ataatggtgg ggcagtgtca 660
tttcagtttt tctttaacat cagcactgat gaccaagaag gcctttacag tctttatttt 720
cataaatgcc ttggaaaaga attgccaagt gacaagttta cattcagcct tgatattgag 780
atcacagaga agaatcctga cagctacctc tcagcaggag aaattcctct ccccaaatta 840
tacatctcaa tggccttttt cttctttctt tctgggacca tctggattca tatccttcga 900
aaacgacgga atgatgtatt taaaatccac tggctgatgg cggcccttcc tttcaccaag 960
tctctttcct tggtgttcca tgcaattgac taccactaca tctcctccca gggcttccct 1020
atcgaaggct gggctgttgt gtactacata actcaccttt tgaaaggggc gctactcttc 1080
atcaccattg cactcattgg cactggctgg gctttcatta agcacatcct ttctgataaa 1140
gacaaaaaga tcttcatgat tgtcattcca ctccaggtcc tggcaaatgt agcctacatc 1200
atcatagagt ccaccgagga gggcacgact gaatatggct tgtggaagga ctctctattt 1260
ctggtcgacc tgttgtgttg tggtgccatc ctcttcccag tggtgtggtc aatcagacat 1320
ttacaagaag catcagcaac agatggaaaa gctgctatta acttagcaaa gctgaaactt 1380
ttcagacatt attacgtctt gattgtgtgt tacatatact tcactaggat cattgcattt 1440
ctcctcaaac tcgctgttcc attccagtgg aagtggctct accagctcct ggatgaaacg 1500
gccacactgg tcttctttgt tctaacgggg tataaattcc gtccggcttc agataacccc 1560
tacctacaac tttctcagga agaagaagac ttggaaatgg agtccgtgta agaaatcttt 1620
cttccctctt ccttagccct gaaccctttg nctaacacaa agcagcacag tgtgaatcga 1680
gccggctggt ctcagcattt cgtggctgca ggggtgggtc ctctatattt agcagaaggg 1740
accggcactg gagcccaagg ggtcggtctg gttgaaggca agatttggca accatactgg 1800
gctgtgccgg aaaaggaaag ggggggccaa aaaacaattg gggccggcgt caaaaaaccg 1860
ggcgaacaag agaaaaagcg ggcccaggag aaag 1894



<210> 54 
<211> 1668 
<212> DNA 

<213> Homo sapiens 



<220> 
<221> misc_feature

<223> Incyte ID No: 7472200CB1 

<400> 54 

atgacactgg tttactttcc tccttcaaag cttcagcagc agcagcagcc atcgagatcc 60
agtcgcctgg cccaacagtt ggcccaatcc tcctggcagc tggccctgcg ctttggcaaa 120
cggaccacta tccacggcct ggacaggctg cttagtgcca aggccagtcg atgggagcga 180
ttcgtctggc tgtgcacctt tgtgagtgcc ttcctgggcg cggtgtacgt ttgcctgatt 240
ctctccgccc gctacaacgc cgcccacttc cagacggtgg tggatagcac gcggtttccg 300
gtttaccgca taccatttcc ggtcataacg atctgcaacc ggaatcgcct caactggcaa 360
cgcctggcgg aggcgaagtc aagattcctg gccaacggca gcaactccgc ccagcaggag 420
ctcttcgagc tgattgtggg cacctacgac gatgcttact tcggtcactt tcagtccttc 480
gagcgattgc gcaaccagcc aacggagctg ctcaactatg tcaatttcag ccaggtggtg 540
gattttatga cctggcgctg caacgagctg ctcgcggaat gcctgtggcg ccaccatgcc 600
tacgactgct gcgagatccg ctcgaagcgg cgcagcaaga acggcttgtg ctgggctttc 660
aactcgctgg agacggaaga gggcaggcgg atgcagctgc tcgatcccat gtggccctgg 720
cgtactgggt cggcgggtcc catgagcgcc ctctccgtgc gtgttctcat ccagcccgcg 780
aagcactggc cggggcacag ggagacgaat gccatgaagg gcatcgatgt catggttacc 840
gagccatttg tgtggcacaa caatccgttc ttcgtggccg cgaacacgga gacgaccatg 900
gagatcgaac ccgtcatcta cttctatgac aacgacaccc ggggagttcg ctccgaccag 960
cgccagtgcg tcttcgatga tgagcacaac agcaaggatt tcaagtcgct gcaaggatac 1020
gtttacatga ttgaaaactg tcagtccgag tgccatcagg agtacttggt gcgctattgc 1080
aactgcacaa tggacctact gtttccaccg gacctgctca tctactccca caatcccggc 1140
gagaaggagt tcgttcgcaa ccaatttcag ggaatgtcct gcaagtgctt ccgcaactgc 1200
tactccctca actacatcag cgatgtccgg cccgccttcc tgccaccgga tgtgtacgca 1260
aacaactcct atgtggacct ggatgtgcac tttcgcttcg agaccattat ggtctatcgc 1320
accagcctcg tcttcggctg ggtggactta atggttagct ttggaggaat tgccggtctt 1380
tttcttggct gctccctaat tagtggcatg gaactggcct atttcctgtg cattgaggtg 1440
ccggcctttg ggctggatgg actgcgtcga aggtggaagg ctcgacggca gatggatctg 1500
ggcgtaaccg tgcccacgcc cactttgaac tttcaacaaa ccacgcccag tcagctgatg 1560
gagaactaca ttatgcaact gaaggctgag aaggcgcaac agcagaaggc gaactttcaa 1620
aactggcacc gcataacatt tgctcaaaag catgttattg gcaagtga 1668